ABSTRACT

Context. Thanks to the large collecting area (3 x ~1500 cm^2 at 1.5 keV) and wide field of view (30' across in full field mode) of
the X-ray cameras on board the European Space Agency X-ray observatory XMM-Newton, each individual pointing can result in the 
detection of up to several hundred X-ray sources, most of which are newly discovered objects. Since XMM-Newton has now been in 
orbit for more than 15 years, hundreds of thousands of sources have been detected. 

Aims. Recently, many improvements in the XMM-Newton data reduction algorithms have been made. These include enhanced source 
characterisation and reduced spurious source detections, refined astrometric precision of sources, greater net sensitivity for source 
detection, and the extraction of spectra and time series for fainter sources, both with better signal-to-noise. Thanks to these
enhancements, the quality of the catalogue products has been much improved over earlier catalogues. Furthermore, almost 50% more
observations are in the public domain compared to 2XMMi-DR3, allowing the XMM-Newton Survey Science Centre to produce a 
much larger and better quality X-ray source catalogue. 

Methods. The XMM-Newton Survey Science Centre has developed a pipeline to reduce the XMM-Newton data automatically. Using 
the latest version of this pipeline, along with better calibration, a new version of the catalogue has been produced, using XMM-Newton 
X-ray observations made public on or before 2013 December 31. Manual screening of all of the X-ray detections ensures the highest 
data quality. This catalogue is known as 3XMM. 

Results. In the latest release of the 3XMM catalogue, 3XMM-DR5, there are 565962 X-ray detections comprising 396910 unique 
X-ray sources. Spectra and lightcurves are provided for the 133000 brightest sources. For all detections, the positions on the sky, a 
measure of the quality of the detection, and an evaluation of the X-ray variability are provided, along with the fluxes and count rates in
7 X-ray energy bands, the total 0.2-12 keV band counts, and four hardness ratios. With the aim of identifying the detections, a cross 
correlation with 228 catalogues of sources detected in all wavebands is also provided for each X-ray detection. 

Conclusions. 3XMM-DR5 is the largest X-ray source catalogue ever produced. Thanks to the large array of data products associated 
with each detection and each source, it is an excellent resource for finding new and extreme objects. 

Key words. Catalogs - Astronomical data bases - Surveys - X-rays: general 


* Based on observations obtained with XMM-Newton, an ESA science mission with instruments and contributions directly funded by
ESA Member States and NASA.

** http://cdsarc.u-strasbg.fr/viz-bin/VizieR?-meta.foot&-source=IX/44


1. Introduction 

XMM-Newton (Jansen et al. 2001) is the second cornerstone
mission from the European Space Agency Horizon 2000 programme.
It was launched in December 1999, and thanks to the
~1500 cm^2 of geometric effective area (Turner et al. 2001) for
each of the three X-ray telescopes aboard, it has the largest
effective area of any X-ray satellite (Longinotti 2014). This fact,
coupled with the large field of view (FOV) of 30', means that a
single pointing on average detects 50 to 100 serendipitous X-ray
sources (Watson et al. 2009).

For the past 19 years, the XMM-Newton Survey Science
Centre* (SSC), a consortium of ten European Institutes
(Watson et al. 2001), has developed much of the XMM-Newton
Science Analysis Software (SAS) (Gabriel et al. 2004) for
reducing and analysing XMM-Newton data and created pipelines
to perform standardised routine processing of the XMM-Newton
science data. The XMM SSC has also been responsible for
producing catalogues of all of the sources detected with
XMM-Newton. The catalogues of X-ray sources detected with the three
EPIC (Strüder et al. 2001; Turner et al. 2001) cameras that are
placed at the focal point of the three X-ray telescopes have
been designated 1XMM and 2XMM successively (Watson et al.
2009), with incremental versions of these catalogues indicated
by successive data releases, denoted -DR in association with the
catalogue number. This paper presents the latest version of the
XMM catalogue, 3XMM. The original 3XMM catalogue was
data release 4 (DR4). The publication of this paper coincides
with the release of 3XMM-DR5. This version includes one
extra year of data and increases the number of detections by 7%
with respect to 3XMM-DR4. The number of X-ray detections
in 3XMM-DR5 is 565962, which translates to 396910 unique
X-ray sources. The median flux of these X-ray sources is
~2.4 x 10^-14 erg cm^-2 s^-1 (0.2-12.0 keV), and the data taken span 13
years. The catalogue covers 877 square degrees of sky (~2.1%
of the sky) if the overlaps in the catalogue are taken into
account. 3XMM-DR5 also includes a number of enhancements
with respect to the 3XMM-DR4 version, which are described
in the appendix. The 3XMM-DR5 catalogue is approximately
60% larger than the 2XMMi-DR3 release and five times the
current size of the Chandra source catalogue (Evans et al. 2010).
3XMM uses significant improvements to the SAS and
incorporates developments with the calibration. Enhancements include
better source characterisation, a lower number of spurious source
detections, better astrometric precision, greater net sensitivity,
and spectra and time series for fainter sources, both with
better signal-to-noise. These improvements are detailed throughout
this paper.

A complementary catalogue of ultra-violet and optical
sources detected with the XMM-Newton Optical Monitor (OM,
Mason et al. 2001) in similar fields to the XMM catalogue is
also produced in the framework of the XMM-Newton SSC and
is called the XMM-Newton Serendipitous Ultraviolet Source
Survey (XMM-SUSS in its original form, with the more
recent version named XMM-SUSS2, Page et al. 2012). 3XMM
is also complementary to other recent X-ray catalogues, such
as the Chandra source catalogue mentioned above, and the
1SXPS (Swift-X-ray Telescope (XRT) point source) catalogue
(Evans et al. 2014) of 151524 X-ray point sources detected
with the Swift-XRT over eight years of operation. 1SXPS has
a sky coverage nearly 2.5 times that of 3XMM, but the
effective area of the XRT is less than a tenth of each of the
telescopes on board XMM-Newton (Longinotti 2014). Other earlier
catalogues include all-sky coverage, such as the ROSAT all-sky
survey (RASS, Voges et al. 1999), but the reduced sensitivity of
ROSAT compared to XMM-Newton means that the RASS
catalogue contains just 20% of the number of sources in
3XMM-DR4. However, the different X-ray source catalogues in
conjunction with 3XMM allow searches for long-term variability.


* http://xmmssc.irap.omp.eu/ 


Whilst this paper covers the 3XMM catalogue in general, 
some of the data validation presented was carried out on the 
3XMM-DR4 version that was made public on 23 July 2013. 
3XMM-DR4 contains 531261 X-ray detections that relate to 
372728 unique X-ray sources taken from 7427 XMM-Newton 
observations. 

The paper is structured as follows. Section 2 contains
information concerning the observations used in the 3XMM-DR5
catalogue. Section 3 covers the 3XMM data processing and
details changes made with respect to previous catalogues (see
Watson et al. 2009), such as the exposure selection, the
time-dependent boresight implemented, the suppression of minimum
ionising particle (MIP) events, the optimised flare filtering, the
improved point spread function (PSF) used for the source
detection, new astrometric corrections and the newly derived energy
conversion factors (ECFs). We also outline the new source
flagging procedure. Section 4 covers the source specific products
associated with the catalogue, such as the enhanced extraction
methods for spectra and time series and the variability
characterisation. Section 5 describes the various screening procedures
employed to guarantee the quality of the catalogue, and
Section 6 outlines the statistical methods used for identifying unique
sources in the database. Then, Section 7 describes the procedures
used to cross-correlate all of the X-ray detections with external
catalogues. Section 8 discusses the limitations of the catalogue
and Section 9 characterises the enhancement of this catalogue
with respect to previous versions, with the potential of the
catalogue highlighted by several examples of objects that can be
found in 3XMM, in Section 10. Finally, information on how
to access the catalogue is given in Section 11 and future
catalogue updates are outlined in Section 12, before concluding with
a Summary.

2. Catalogue observations 

3XMM-DR5 comprises data drawn from 7781 XMM-Newton
EPIC observations that were publicly available as of
31 December 2013 and that processed normally. The
Hammer-Aitoff equal area projection in Galactic coordinates of the
3XMM-DR5 fields can be seen in Fig. 1. The data in
3XMM-DR5 include 440 observations that were publicly available at the
time of creating 2XMMi-DR3, but were not included in
2XMMi-DR3 due to high background or processing problems. All of
those observations containing > 1 ks of clean data (> 1 ks of good
time interval) were retained for the catalogue. Fig. 2 shows the
distribution of total good exposure time (after event filtering) for
the observations included in the 3XMM-DR5 catalogue and
using any of the thick, medium or thin filters, but not the open
filter. There are just three observations with one or more cameras
configured with the open filter. The number of the 7781
XMM-Newton observations included in the 3XMM-DR5 catalogue for
each observing mode and each filter is given in Table 1. Open
filter data were processed but not used in the source detection stage
of pipeline processing. The same XMM-Newton data modes were
used as in 2XMM (Watson et al. 2009) and are included in
appendix B of this paper, for convenience.

Fig. 1. Hammer-Aitoff equal area projection in Galactic coordinates of
the 7781 3XMM-DR5 fields.

Fig. 2. Distribution of total good exposure time (after event filtering)
for the observations included in the 3XMM-DR5 catalogue (for each
observation the maximum time of all three cameras per observation was
used).

The only significant difference was the inclusion of mosaic
mode data. Most XMM-Newton observations are performed in
pointing mode, where the spacecraft is locked on
to a fixed position on the sky for the entire observation. In
revolution 1812 (30 October 2009), a specific mosaic observing
mode was introduced in which the satellite pointing direction
is stepped across the sky, taking snapshots at points
(sub-pointings) on a user-specified grid. Data from dedicated
mosaic mode or tracking (mosaic-like) observations are recorded
into a single observation data file (ODF) for the observation.
In previous pipeline processing, the pipeline products from the
small number of mosaic-like observations were generally
generated, at best, for a single sub-pointing only. This is because
the pipeline filters data such that only events taken during an
interval where the attitude is stable and centred on the nominal
observation pointing direction (within a 3' tolerance) are
accepted. Data from some, or all, of the other sub-pointings were
thus typically excluded. During 2012, the XMM-Newton Science
Operations Centre (SOC) devised a scheme whereby the parent
ODF of a mosaic mode observation is split into separate ODFs,
one for each mosaic sub-pointing. All relevant data are contained
within each sub-pointing ODF and the nominal pointing
direction is computed for the sub-pointing. This approach is applied
to both formal mosaic mode observations and those
mosaic-like/tracking observations executed before revolution 1812. For
a mosaic mode observation, the first 8 digits of its 10-digit
observation identifier (OBS_ID) are common to the parent
observation and its sub-pointings. However, while the OBS_ID of
the parent observation almost always ends in 01, the last two
digits of the sub-pointings form a monotonic sequence, starting at 31.
Mosaic mode sub-pointings are thus immediately recognisable
in having OBS_ID values whose last two digits are ≥ 31.
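
The OBS_ID convention described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (10-digit identifier, first 8 digits shared with the parent, sub-pointing sequence starting at 31); the function names and the example identifiers are hypothetical and are not taken from the catalogue or from the SAS.

```python
def is_mosaic_subpointing(obs_id: str) -> bool:
    """Return True if the last two digits of a 10-digit OBS_ID are >= 31.

    Assumption for this sketch: OBS_ID is handled as a zero-padded string.
    """
    if len(obs_id) != 10 or not (obs_id.isascii() and obs_id.isdigit()):
        raise ValueError("OBS_ID must be a 10-digit string")
    return int(obs_id[-2:]) >= 31


def parent_prefix(obs_id: str) -> str:
    """First 8 digits, common to a parent observation and its sub-pointings."""
    return obs_id[:8]


# Hypothetical identifiers, used purely for illustration.
parent = "0123456701"
sub = "0123456731"
same_family = parent_prefix(parent) == parent_prefix(sub)
```

Grouping detections by `parent_prefix` is one simple way to collect all sub-pointings that belong to the same mosaic observation.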


To the pipeline, mosaic mode (and mosaic-like) observation
sub-pointings are transparent. No special processing is applied.
Each sub-pointing is treated as a distinct observation. Source
detection is performed on each sub-pointing separately and no
attempt is made to simultaneously fit common sources detected
in overlapping regions of multiple sub-pointings. While
simultaneous fitting is possible, this aspect had not been sufficiently
explored or tested during the preparations for the 3XMM
catalogues.

There are 45 observations performed in the dedicated
mosaic mode before the bulk processing cut-off date of 8
December 2012, of which 37 are included in 3XMM-DR5 (see
appendix point 1). None of these was available for catalogues
prior to 3XMM. In total, there are 356 processed mosaic
sub-pointings in the 3XMM-DR5 catalogue.


The data used for the 3XMM catalogues have been reprocessed
with the latest version of the SAS and the most up to date
calibration available at the time of the processing. The majority of
the processing for 3XMM-DR5 was conducted during December
2012/January 2013, with the exception of 20 observations
processed during 2013. The SAS used was similar to SAS 12.0.1
but included some upgraded tasks required for the pipeline. The
SAS manifest for tasks used in the cat9.0 pipeline and the static
set of current calibration files (CCFs) that were used for the bulk
reprocessing are provided via a dedicated online webpage.

There are 31 observations in 2XMMi-DR3 that did not make
it into 3XMM-DR5, mainly due to software/pipeline errors
during processing. Typical examples of the latter problems are due
to revised ODFs (e.g. with no useful time-correlation
information), more sophisticated SAS software that identified issues
hitherto not trapped, or issues with exposure corrections of
background flare light curves and pn time-jumps.

The main data processing steps used to produce the 3XMM
data products were similar to those outlined in Watson et al.
(2009) and described on the SOC webpage. In brief, these
steps were the production of calibrated detector events from the
ODFs; identification of stable background time intervals;
identification of "useful" exposures (taking account of exposure time,
instrument mode, etc.); generation of multi-energy-band X-ray
images and exposure maps from the calibrated events; source
detection and parameterisation; cross-correlation of the source
list with a variety of archival catalogues, image databases and
other archival resources; creation of binned data products; and
application of automatic and visual screening procedures to check for
any problems in the data products. The data from this
processing have been made available through the XMM-Newton Science
Archive (XSA).


The only change applied for identifying exposures to be
processed by the pipeline, compared to that adopted in pre-cat9.0
processing (Watson et al. 2009, see their section 4.1), was the
exclusion of any exposure taken with the Open filter. This was
done because use of the Open filter leads to increased
contamination from optical light (optical loading). Eight exposures (from


^ http://xmmssc-www.star.le.ac.uk/public/pipeline/doc/04_cat9.0_20121220.153 
^ http://xmm.esac.esa.int/sas/current/howtousesas.shtml 
http://xmm.esac.esa.int/xsa/ 


3. Data processing 


3.1. Exposure selection 
Table 1. Characteristics of the 7781 XMM-Newton observations included in the 3XMM-DR5 catalogue.

                     Modes                        Filters
Camera    Full(a)   Window(b)   Other(c)    Thin    Medium   Thick    Total
pn        5853      495         -           3327    2633     388      6348
MOS1      6045      1306        309         3296    3774     590      7660
MOS2      6100      1341        248         3303    3789     597      7689

(a) Prime Full Window Extended (PFWE) and Prime Full Window (PFW) modes; (b) pn Prime Large Window (PLW) mode and any of
the various MOS Prime Partial Window (PPW) modes; (c) other MOS modes (Fast Uncompressed (FU), Refresh Frame Store (RFS)).


five observations) taken with the Open filter were excluded from 
the data publicly available for the 3XMM-DR5 catalogue. 

3.2. Event list processing 

Much of the pipeline processing that converts raw ODF event file
data from the EPIC instruments into cleaned event lists has
remained unchanged from the pre-cat9.0 pipeline and is described
in section 4.2 of Watson et al. (2009). However, we describe three
alterations to the approach used for 2XMM.

3.2.1. Time-dependent boresight 

Analysis by both the XMM-Newton SSC and the SOC
established the presence of a systematic, cyclic (≈362 day)
time-dependent variation in the offset of each EPIC (and OM and
RGS) instrument boresight from their nominal pointing
positions, for each observation. This seasonal dependence is
superposed on a long term trend, the semi-amplitude of the seasonal
oscillation being ≈1.2" in the case of the EPIC instruments
(Talavera et al. 2012). These variations of the instrument
boresights have been characterised by simple functions in calibration
(Talavera et al. 2012; Talavera & Rodríguez-Pascual 2014). The
origin of the variation is uncertain but might arise from heating
effects in the support structures of the instruments and/or
spacecraft star-trackers. No patterns have been identified in the
available housekeeping temperature sensor data, though these may not
sample the relevant parts of the structure.

During pipeline processing of XMM-Newton observations
for the 3XMM catalogues, corrections for this time-dependent
boresight movement are applied to individual event positions in
each instrument, via the SAS task attcalc, based on the
observation epochs of the events.
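
As a rough illustration only, a seasonal oscillation superposed on a long term trend could be modelled as a sinusoid plus a linear term. The functional form, the parameter names and the default trend and phase values below are assumptions made for this sketch; only the ≈362 day period and the ≈1.2" semi-amplitude come from the text. The functions actually applied by attcalc are those held in the calibration files and are not reproduced here.

```python
import math


def boresight_offset_arcsec(t_days, trend0=0.0, trend_rate=0.0,
                            semi_amp=1.2, period_days=362.0, phase=0.0):
    """Toy model of a boresight offset (arcsec) at epoch t_days.

    Assumed form: linear trend + sinusoidal seasonal term. All defaults
    except semi_amp and period_days are placeholders, not calibration values.
    """
    seasonal = semi_amp * math.sin(2.0 * math.pi * t_days / period_days + phase)
    return trend0 + trend_rate * t_days + seasonal


# With no trend, the offset peaks at the semi-amplitude a quarter period in.
peak = boresight_offset_arcsec(362.0 / 4.0)
```

In such a model, a per-event correction would simply subtract the offset evaluated at the event epoch from the event position along the relevant axis.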

3.2.2. Suppression of Minimum Ionizing Particle events in EPIC-pn data

High energy particles can produce electron-hole pairs in the
silicon substrate of the EPIC-pn detector. While onboard processing
and standard pn event processing in the pipeline removes most of
these so-called minimum ionizing particle events (Strüder et al.
2001), residual effects can arise when MIPs arrive during the
pre-exposure offset-map analysis and can give rise to features
that appear as low-energy noise in the pn detector. Typically,
these features are spatially confined to a clump of a few pixels
and appear only in band 1. However, in pre-cat9.0 pipeline
processing, such features were sometimes detected as sources
during source detection and these were not always recognised and
flagged during the manual flagging process outlined in section
7.4 of Watson et al. (2009). The SAS task epreject was
incorporated into the pipeline processing for 3XMM and in most cases
corrects for these MIP events during processing of pn events.


3.2.3. Optimised flare filtering

In previous pipeline processing (pre-cat9.0 pipelines), the
recognition of background flares and the creation of good time
intervals (GTIs) between them was as described in section 4.3 of
Watson et al. (2009), where the background light curves were
derived from high energy data and the count rate thresholds for
defining the GTIs were based on (different) constant values for
each instrument. In the processing for 3XMM, two key changes
have been made.

Firstly, rather than adopting fixed count rate thresholds in
each instrument, above which data are rejected, an
optimisation algorithm has been applied that maximises the
signal-to-noise (S/N) for the detection of point sources. Secondly, the light
curves of the background data used to establish the count rate
threshold for excluding background flares are extracted in an
'in-band' (0.5-7.5 keV) energy range. This was done so that the
process described below resulted in maximum sensitivity to the
detection of objects in the energy range of scientific interest.

The overall process for creating the background flare GTIs
for each exposure within each observation involved the
following steps:

1. For each exposure, a high energy light curve (from 7 to 15
keV for pn, > 14 keV for MOS) is created, as previously, and
initial background flare GTIs are derived using the optimised
approach employed in the SAS task bkgoptrate (see below).

2. Following the identification of bad pixels, event cleaning and
event merging from the different CCDs, an in-band image
is then created, using the initial GTIs to excise background
flares.

3. The SAS task eboxdetect then runs on the in-band image to
detect sources with a likelihood > 15. This is already very
conservative, as only very bright (likelihood ≫ 100), variable
sources are able to introduce any significant source
variability component into the total count rate of the detector
(accumulated from most of the field).

4. An in-band light curve is subsequently generated, excluding
events from circular regions of radius 60" for sources with
count rates <0.35 counts/s or 100" for sources with count
rates >0.35 counts/s, centred on the detected sources.

5. The SAS task bkgoptrate is then applied to the light curve
to find the optimum background rate cut threshold and this is
subsequently used to define the final background flare GTIs.

The optimisation algorithm adopted broadly follows that
used for the processing of ROSAT Wide Field Camera data for
the ROSAT 2RE catalogue (Pye et al. 1995). The process seeks
to determine the background count rate threshold at which the
remaining data below the threshold yield a S/N ratio,
$S = C_s/\sqrt{C_b}$, for a (constant) source that is a maximum.
Here $C_s$ is the number of source counts and $C_b$ is the number
of background counts. Since we are interested, here, in finding the
background rate cut that yields the maximum S/N and are not
concerned about the absolute value of that S/N, then for background
light curves with bins of constant width, as created by the pipeline
processing, $C_s$ is proportional to the number of bins retained and
$C_b$ to the sum of their count rates, so S can be expressed as

$$ S = \frac{N}{\sqrt{\sum_{i=1}^{N} r_i}} \qquad (1) $$

where N is the number of bins with background count rates
below the threshold, $r_T$, and $r_i$ is the count rate in time bin i;
the summation is over the time bins with a count rate < $r_T$. Time
bins are of 10 s width for pn and 26 s for MOS. The process sorts
the time bins in order of decreasing count rate. Starting from the
highest count rate bin, bins are sequentially removed, computing
equation 1 at each step. With the count rate of the bin removed at
each step representing a trial background count rate cut
threshold, this process yields a curve of S/N vs. background count rate
cut threshold. The background cut corresponding to the peak of
the S/N curve is thus the optimum cut threshold.
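
The threshold search described above can be sketched as follows. This is a minimal illustration of maximising S = N / sqrt(sum of r_i), not the bkgoptrate implementation: the function name is hypothetical, the bins are accumulated in ascending order of count rate (equivalent to removing the highest bins one at a time), and the sketch assumes that bins with a rate equal to the trial threshold are kept.

```python
import math


def optimum_rate_cut(rates):
    """Return (cut, n_kept, snr) maximising S = N / sqrt(sum of kept rates).

    rates: background count rates, one per time bin of constant width.
    Assumption: bins with rate <= cut are retained.
    """
    if not rates:
        raise ValueError("need at least one time bin")
    ordered = sorted(rates)
    best = (ordered[-1], len(ordered), 0.0)
    running_sum = 0.0
    for n, rate in enumerate(ordered, start=1):
        running_sum += rate
        # Only evaluate once all bins sharing this rate have been included.
        if n < len(ordered) and ordered[n] == rate:
            continue
        if running_sum <= 0.0:
            continue
        snr = n / math.sqrt(running_sum)
        if snr > best[2]:
            best = (rate, n, snr)
    return best


# Strong flare: the 10 high-rate bins are rejected.
cut_a, kept_a, _ = optimum_rate_cut([1.0] * 90 + [50.0] * 10)
# Mild enhancement: keeping every bin gives the higher S/N.
cut_b, kept_b, _ = optimum_rate_cut([1.0] * 90 + [1.2] * 10)
```

The two toy light curves show the behaviour discussed in the text: a pronounced flare is excised, whereas a modest rise in background is retained because the extra exposure outweighs the extra noise.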

In Fig. 3 we show four examples of in-band background
time series in the top row, accompanied by the respective S/N
vs. background-cut-threshold plots in the bottom row. The first
panel in each row represents a typical observation (MOS1) with
some significant background flaring activity. The optimum cut
level of 1.83 cts/s leads to the creation of GTIs that exclude
portions of the observation where the background exceeds the cut
threshold. The second panels are for a pn observation with a
stable, low background level. The optimum cut in the background
includes all the data and thus generates a GTI spanning the
entire observation. This is also true for the third panels, which show
a MOS1 case where the background is persistently high (above
the level where the whole observation would have been rejected
in pre-cat9.0 pipeline processing). The fourth panels are for an
example of a variable background which gives rise to a
double-peaked S/N vs. background-rate-cut curve. Here, raising the
threshold from ~18 cts/s to ~28 cts/s simply includes a steeply
rising background rate early in the observation, causing a dip in
the S/N versus background-rate-cut curve. However, as the rate
cut threshold is increased above 30 cts/s, although the count rate
is higher, a lot more exposure time is available, so the S/N curve
rises again and the optimum cut includes almost all the data. It
should be emphasised that the fixed cut thresholds used for MOS
and pn in previous XMM processings cannot be directly
compared to the optimised ones used here because of the change in
energy band being used to construct the background light curve.
It is, however, worth noting that the fixed cuts used previously
often result in very similar GTIs to those generated by the
optimisation process described above. This is because the previous
fixed instrument thresholds were based on analyses that sought
to find a representative level for the majority of XMM-Newton
observations.

We discuss some of the gains of using this optimisation
approach in section 9.3 and some known issues in section 8.

3.3. Source detection using the empirical Point Spread
Function (PSF) fitting

The bulk reprocessing for 3XMM took advantage of new
developments related to the EPIC PSFs. The source detection stage in
previous pipelines (Watson et al. 2009, see their section 4.4.3)
made use of the so-called 'default' (or medium accuracy) PSF
functions determined by ray tracing of the XMM-Newton
mirror systems. However, these default PSF functions recognised
no azimuthal dependence in the core of the source profile, did
not adequately describe the prominent spoke structures seen in
source images (arising from the mirror support structures) and
were created identically for each EPIC camera.

Fig. 3. Flare background light curves (top row) and their
corresponding S/N vs. background cut threshold plots (bottom row). The leftmost
panels are for a typical observation with notable background flaring.
The second pair of vertically aligned panels shows an example where
the background has a persistently low level, while the third pair of
panels reflects an example where the background is persistently high. The
rightmost panels show an example of a variable background which gives
rise to a double-peaked S/N vs. background-rate-cut curve. The
vertical red lines in the lower panels indicate the optimum
background-cut-threshold (i.e. the peak of the curve) derived for the light curves in the
top panels. In the upper panels the applied optimum cut-rate is also
shown in red as horizontal lines.

To address the limitations of the default EPIC PSFs, a set of
empirical PSFs was constructed, separately for each instrument,
by careful stacking of observed XMM source images over a grid
of energy and off-axis angles from the instrument boresights.
The cores and spoke patterns of the PSFs were then modelled
independently, so that implementation within the XMM-Newton
SAS calibration software enables PSFs to be reconstructed
that take the off-axis and azimuthal locations of a source into
account, as well as the energy band. The details of the issues
associated with the default PSF and the construction and validation
of the empirical PSF are presented in Read et al. (2011).

The use of the empirical PSF has several ramifications in
source detection. Firstly, the better representation of structures
in the real PSF results in more accurate source
parameterisation. Secondly, it helps reduce the number of spurious
detections found in the wings of bright sources. This is because the
previous medium accuracy PSFs did not adequately model the
core and spoke features, leaving residuals during fitting that were
prone to being detected as spurious sources. With the empirical
PSFs, fewer such spurious detections are found, especially in the
wings of bright objects positioned at larger (> 6') off-axis angles.
Thirdly, as a result of the work on the PSFs, the astrometric
accuracy of XMM-Newton source positions has been significantly
improved (see Read et al. 2011).

3.3.1. Other corrections related to the PSF 

During the late stages of testing of the pipeline used for the
bulk reprocessing that fed into the 3XMM-DR4 catalogue, an
analysis of XMM-Newton X-ray source positions relative to the
high-accuracy (< 0.1") reference positions of Sloan Digital Sky
Survey (SDSS, DR9) quasars identified a small but significant,
off-axis-angle-dependent position shift, predominantly along the
radial vector from the instrument boresight to the source. The
effect, where the real source position is closer to the instrument
boresight than that inferred from the fitted PSF centroid, has a
negligible displacement on axis and grows to ~0.65" at off-axis
angles of 15'. This radial shift is due to the displacement between
the true position of a source and the defined centroid (as
determined by a 3-dimensional, circular, Gaussian fit to the model
PSF profile) of the empirical PSF, which grows as the PSF
becomes increasingly distorted at high off-axis angles. It should be
noted that identifying and measuring this effect has only been
possible because of the corrections for other effects (see
section 3.3 and below) that masked it, and because of the large
number of sources available that provide sufficient statistics. In
due course a correction for this effect will be applied directly to
event positions, on a per-instrument basis, via the XMM-Newton
calibration system, but for the 3XMM-DR4 catalogue, to avoid
delays in its production, a solution was implemented within the
catcorr SAS task. A correction, computed via a third-order
polynomial function, is applied to the initial PSF-fitted coordinates of
each source output by emldetect, i.e. prior to the field
rectification step, based on the off-axis angle of the source as measured
from the spacecraft boresight. This correction is embedded in the
RA and DEC columns, which also include any rectification
corrections (section 3.4). The correction is computed and applied in
the same way for both the 3XMM-DR4 and 3XMM-DR5
catalogues.
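
A radial correction of this kind can be sketched as below. The polynomial coefficients used by catcorr are not given in the text, so the coefficients here are purely illustrative placeholders, chosen only so that the shift vanishes on axis and reaches 0.65" at 15'. The function names and the flat, boresight-centred coordinate frame (offsets in arcsec) are likewise assumptions of the sketch.

```python
import math

# Placeholder coefficients (c0, c1, c2, c3) for shift(theta) in arcsec, with
# theta in arcmin. NOT the catcorr values: they only reproduce 0" on axis
# and 0.65" at 15'.
ILLUSTRATIVE_COEFFS = (0.0, 0.0, 0.0, 0.65 / 15.0 ** 3)


def radial_shift_arcsec(theta_arcmin, coeffs=ILLUSTRATIVE_COEFFS):
    """Third-order polynomial giving the radial shift at off-axis angle theta."""
    return sum(c * theta_arcmin ** k for k, c in enumerate(coeffs))


def correct_position(dx_arcsec, dy_arcsec, coeffs=ILLUSTRATIVE_COEFFS):
    """Move a fitted position towards the boresight by the polynomial shift.

    (dx, dy) are offsets of the PSF-fitted centroid from the boresight.
    """
    r = math.hypot(dx_arcsec, dy_arcsec)
    if r == 0.0:
        return 0.0, 0.0
    scale = (r - radial_shift_arcsec(r / 60.0, coeffs)) / r
    return dx_arcsec * scale, dy_arcsec * scale


# A source fitted 15' (900") off axis is pulled inwards by ~0.65".
x_corr, y_corr = correct_position(900.0, 0.0)
```

The correction acts along the radial vector only, so the azimuth of the source about the boresight is unchanged.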

A second PSF-related problem that affected 2XMMi-DR3
positions was uncovered during early testing of the empirical
PSF (see Read et al. 2011). This arose from a 0.5 pixel error
(in both the x and y directions) in the definition of the pixel
coordinate system of the medium-accuracy PSF map. As pixels in
the PSF map are defined to be 1" x 1", the error is equivalent
to 0.5" in each direction. When transferred to the image frame
during PSF fitting in emldetect, this error in the PSF map
coordinate system manifested itself as an offset of up to 0.7" in the
RA/DEC of a source position, varying with azimuthal position
within the field. The introduction of the empirical PSF removes
this error.
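
As a simple consistency check (our arithmetic, assuming the two 0.5" components add in quadrature), the quoted maximum offset of about 0.7" is what a 0.5" error in each of x and y would produce:

```latex
\Delta_{\max} = \sqrt{(0.5'')^2 + (0.5'')^2} = 0.5'' \times \sqrt{2} \approx 0.71''
```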

3.4. Astrometric rectification 
3.4.1. Frame correction 

Celestial coordinates of sources emerging from the PSF fitting
step of pipeline processing of a given observation include a
generally small systematic error arising from offsets in the
spacecraft boresight position from the nominal pointing direction for
the observation. The uncertainty is due to imprecisions in the
attitude solution derived from data from the spacecraft's
star-trackers and may result in frame shifts that are typically ~1" (but
can be as much as 10" in a few cases) in the RA and DEC
directions and a rotation of the field about the boresight of the order
of 0.1 degrees. To correct for (i.e. rectify) these shifts, an attempt
is made to cross-correlate sources in the XMM-Newton field of
view with objects from an astrometric reference catalogue. X-ray
sources with counterparts in the reference catalogue are used to
derive the frame shifts and rotation that minimise the
displacements between them. In all previous pipeline processing (and
catalogues derived from them) these frame corrections were
estimated using the SAS task eposcorr, which used a single
reference catalogue, USNO-B1.0, and the SAS task evalcorr to
determine the success and reliability of the outcome (Watson et al.
2009, see their section 4.5).

The processing system used to create the data for the 3XMM catalogues
makes use of some important improvements to this field rectification
procedure. These are embedded in the new SAS task, catcorr, which
replaces eposcorr and evalcorr. Firstly, the new approach incorporates an
iterative fitting function (Nelder & Mead 1965) to find the optimum
frame-shift corrections: previously the optimum shift was obtained from a
grid-search procedure. Secondly, the cross-match between XMM-Newton and
reference catalogue source positions is carried out using three reference
catalogues: (1) USNO-B1.0 (Monet et al. 2003), (2) 2MASS (Skrutskie et
al. 2006) and, where sky coverage permits, (3) the Sloan Digital Sky
Survey (DR9) (Abazajian et al. 2009). The analysis is conducted using
each catalogue separately. When there is an acceptable fit from at least
one catalogue, the RA and DEC frame shifts and the rotation derived from
the 'best' case are used to correct the source positions. A fit is
considered acceptable if there are at least 10 X-ray/counterpart pairs,
the maximum offset between a pair (X-ray source, i and counterpart, j) is
< 10" and the goodness of fit statistic

$$ L = \sum_{i=1}^{n_x} \sum_{j=1}^{n_o} \max(0.0,\, p_{ij} - q_{ij}) > 5 \qquad (2) $$

where $p_{ij} = e^{-\frac{1}{2}(r_{ij}/\sigma_{ij})^2}$ and
$q_{ij} = n_o (r_{ij}/r_f)^2$. Here, $p_{ij}$ is the probability of
finding the counterpart at a distance $> r_{ij}$ from the X-ray source
position given the combined (in quadrature) positional uncertainty,
$\sigma_{ij}$, while $q_{ij}$ is the probability that the counterpart is
a random field object within $r_{ij}$. An estimate of the local surface
density of field objects from the reference catalogue is made by counting
the number, $n_o$, of such objects within a circular region of radius
$r_f$ (set to 1') around each XMM source; $n_x$ is the number of X-ray
sources in the XMM field. The L statistic, which represents a heuristic
approach to the problem of identifying likely matching counterparts, is
computed over the set of matching pairs and is a measure of the dominance
of the closeness of the counterparts over the probability of random
matches. The shifts in RA and DEC and the rotation are adjusted within
the fitting process to maximise L. Extensive trials found that if L > 5,
the result is generally reliable. Where more than one reference catalogue
gives an acceptable solution, the one with the largest L value is
adopted.
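
The acceptance test described above can be illustrated with a minimal Python sketch. It reads p_ij as exp[-(r_ij/sigma_ij)^2/2] and q_ij as n_o (r_ij/r_f)^2, and assumes the matched pairs are already available as (r_ij, sigma_ij, n_o) tuples in arcseconds. The function names and this data layout are illustrative assumptions and are not taken from the catcorr task; the Nelder & Mead adjustment of the shifts and rotation is omitted.

```python
import math

R_F_ARCSEC = 60.0  # r_f = 1 arcmin, as quoted in the text


def l_statistic(pairs, r_f=R_F_ARCSEC):
    """Sum max(0, p_ij - q_ij) over matched pairs.

    pairs: iterable of (r_ij, sigma_ij, n_o), with r_ij and sigma_ij in
    arcsec and n_o the local count of reference objects within r_f.
    (This tuple layout is an assumption made for the example.)
    """
    total = 0.0
    for r_ij, sigma_ij, n_o in pairs:
        p_ij = math.exp(-0.5 * (r_ij / sigma_ij) ** 2)  # offset exceeds r_ij
        q_ij = n_o * (r_ij / r_f) ** 2                   # random field match
        total += max(0.0, p_ij - q_ij)
    return total


def fit_is_acceptable(pairs, min_pairs=10, max_offset=10.0, l_min=5.0):
    """Apply the three acceptance criteria quoted in the text."""
    pairs = list(pairs)
    if len(pairs) < min_pairs:
        return False
    if max(r for r, _, _ in pairs) >= max_offset:
        return False
    return l_statistic(pairs) > l_min
```

In a full fit, a function like l_statistic would be re-evaluated for each trial set of frame shifts and rotation, and the parameters giving the largest L would be kept.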

In the 3XMM catalogues, the corrected coordinates are placed in the RA
and DEC columns; the original uncorrected coordinates are reported via
the RA_UNC and DEC_UNC columns. A catalogue identifier for the catalogue
yielding the 'best' result is provided in the REFCAT column. If the best
fit has parameter values (e.g. the number of matches used) that fall
below the specific constraints mentioned above, the original, uncorrected
positions are retained (written to both the RA and DEC and the RA_UNC and
DEC_UNC columns) and the REFCAT identifier takes a negative value.
Further details may be found in the documentation for the catcorr task.
This new rectification algorithm is successful for about 83% of
observations, which contain 89% of detections, reflecting a significant
improvement compared to the previous approach where ~65% of fields could
be corrected. The main gain comes from the use of the 2MASS catalogue,
which is particularly beneficial in obtaining rectification solutions in
the galactic plane - it should be pointed out that similar gains would be
obtained with eposcorr if used with the expanded set of reference
catalogues. It should be noted that the extracted lists of objects from
each of the three reference catalogues that lie within the full EPIC
field of view for a given observation are provided to users of XMM-Newton
data products via the file-type=REFCAT product file, which is used by the
task, catcorr.

3.4.2. Systematic position errors 

As discussed in section 9.5 of Watson et al. (2009), for the 2XMM
catalogue (and relevant to subsequent incremental catalogues in the 2XMM
series), the angular deviations of SDSS (DR5) quasars (Schneider et al.
2007) from their XMM-Newton X-ray counterparts, normalised by the
combined position errors, could not be modelled by the expected Rayleigh
distribution unless an additional systematic uncertainty (SYSERR
parameter in 2XMM) was added to the statistical position error
(RADEC_ERR parameter in 2XMM and see Appendix O) derived during the PSF
fitting process. Watson et al. (2009) showed that this systematic was not
consistent with the uncertainty arising from the rectification procedure
used for the 2XMM processing and ultimately adopted an
empirically-determined systematic error value that produced the best
match between the distribution of XMM-quasar offsets and the expected
Rayleigh curve.

As part of the upgrade applied to the rectification process for the bulk
reprocessing used for the 3XMM catalogues, the uncertainty arising from
this step has been computed, in particular, taking into account the error
component arising from the rotational offset. Errors ($1\sigma$) in each
component, i.e. on the RA offset ($\Delta\alpha_c$), on the DEC offset
($\Delta\delta_c$) and on the rotational angle offset ($\Delta\phi_c$),
have been combined in quadrature to give an estimate of the total
positional uncertainty, $\Delta r$, arising from the rectification
process as

$$ \Delta r = \left[ (\Delta\alpha_c)^2 + (\Delta\delta_c)^2 + (\theta_c \cdot \Delta\phi_c)^2 \right]^{1/2} \qquad (3) $$

where $\theta_c$ is the radial off-axis angle, measured in the same units
as $\Delta\alpha_c$ and $\Delta\delta_c$, and $\Delta\phi_c$ is in
radians.

Inclusion of this rectification error (column SYSERRCC in the 3XMM
catalogues, see Appendix O), in quadrature with the statistical error,
leads to a generally good agreement between the XMM-quasar offset
distribution and the expected Rayleigh distribution compared to the
previous approach and indicates that the empirically-derived systematic
used in pre-3XMM catalogues is no longer needed. This is discussed
further in section 9.2.
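
The quadrature combination described above (RA, DEC and rotation terms, with the rotation error scaled by the off-axis angle) can be sketched in a few lines of Python. The function and argument names are hypothetical and chosen for this example only; they are not catalogue or SAS identifiers.

```python
import math


def rectification_error(d_alpha, d_delta, d_phi_rad, theta):
    """Total rectification uncertainty from its three components.

    d_alpha, d_delta and theta share one angular unit (e.g. arcsec);
    d_phi_rad is the 1-sigma rotation error in radians.
    """
    return math.sqrt(d_alpha ** 2 + d_delta ** 2 + (theta * d_phi_rad) ** 2)


def total_position_error(stat_err, rect_err):
    """Quadrature sum of the statistical and rectification errors."""
    return math.hypot(stat_err, rect_err)
```

For example, rectification_error(0.3, 0.4, 0.0, 600.0) returns 0.5 (same unit as the inputs), and the rotation term grows linearly with off-axis angle when d_phi_rad is non-zero.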

3.5. Energy Conversion Factors (ECFs) 

A number of improvements in the calibration of the MOS and pn instruments
have occurred since the previous, 2XMMi-DR3, catalogue was produced,
which lead to slight changes in the energy conversion factors (ECFs) that
are used for converting count rates in the EPIC energy bands to fluxes
(see Watson et al. 2009, section 4.6, and see Appendix O). Of note is the
fact that MOS redistribution matrices were provided for 13 epochs at the
time of processing for 3XMM and for three areas of the detector that
reflect the so-called 'patch', 'wings-of-patch' and 'off-patch' locations
(Sembay et al. 2011).

For the 3XMM catalogues a simple approach has been adopted. ECFs were
computed following the prescription of Mateos et al. (2009), for energy
bands 1 to 5 (0.2-0.5 keV, 0.5-1.0 keV, 1.0-2.0 keV, 2.0-4.5 keV and
4.5-12.0 keV respectively) and band 9 (0.5-4.5 keV), for full-frame mode,
for each EPIC camera, for each of the Thin, Medium and Thick filters. A
power-law spectral model with a photon index, $\Gamma = 1.7$ and a cold
absorbing column density of $N_{\rm H} = 3 \times 10^{20}$ cm$^{-2}$ was
assumed. As such, users are reminded that the ECFs, and hence the fluxes
provided in the 3XMM catalogues, may not accurately reflect those for
specific sources whose spectra differ appreciably from this power-law
model - see section 4.6 of Watson et al. (2009).

For pn, the ECFs are calculated at the on-axis position. The pn response
is sufficiently stable that no temporal resolution is needed. For MOS, to
retain a direct connection between the ECFs and publicly available
response files, the ECFs used are taken at epoch 13 and are for the
'off-patch' location. The latter choice was made because the large
majority of detections in an XMM-Newton field lie outside the 'patch' and
'wings-of-patch' regions, which only relate to a region of radius < 40",
near the centre of the field. The use of a single epoch (epoch 13) was
made to retain simplicity in the processing and because the response of
the MOS cameras exhibits a step function change (due to a gain change)
between epochs 5 and 6, with different but broadly constant values either
side of the step. None of the 13 calibration epochs represents the
average response and thus no response file exists to which average ECFs
can be directly related. The step-function change in the responses for
MOS is most marked in band 1 (0.2-0.5 keV) for the 'patch' location,
where the maximum range in ECFs either side of the step amounts to 20%.
Outside the 'patch' region, and for all other energy bands, the range of
the ECF values with epoch is < 5% and is < 2.5% for the 'off-patch'
region. Epoch 13 was chosen, somewhat arbitrarily, as being typical of
epochs in the longer post-step time interval.

The ECFs, in units of $10^{11}$ cts cm$^2$ erg$^{-1}$, adopted for the
bulk reprocessing of data used for 3XMM, are provided in Table 2 for each
camera, energy band and filter. The camera rate, ca_RATE, and flux,
ca_FLUX, are related via ca_FLUX = (ca_RATE/ECF) (where ca is PN, M1 or
M2).


Table 2. Energy conversion factors (in units of $10^{11}$ cts cm$^2$
erg$^{-1}$) used to convert count rates to fluxes for each instrument,
filter and energy band.

                          Filters
Camera   Band    Thin    Medium    Thick
pn       1       9.52    8.37      5.11
         2       8.12    7.87      6.05
         3       5.87    5.77      4.99
         4       1.95    1.93      1.83
         5       0.58    0.58      0.57
         9       4.56    4.46      3.76
MOS1     1       1.73    1.53      1.00
         2       1.75    1.70      1.38
         3       2.04    2.01      1.79
         4       0.74    0.73      0.70
         5       0.15    0.15      0.14
         9       1.38    1.36      1.20
MOS2     1       1.73    1.52      0.99
         2       1.76    1.71      1.39
         3       2.04    2.01      1.79
         4       0.74    0.73      0.70
         5       0.15    0.15      0.15
         9       1.39    1.36      1.21
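
As an illustration of how the tabulated factors are applied, the sketch below converts a count rate to a flux using flux = rate/ECF, taking the ECF unit to be 10^11 cts cm^2 erg^-1. Only the pn Thin-filter values are copied from the table; the dictionary and function names are assumptions made for this example.

```python
ECF_UNIT = 1.0e11  # cts cm^2 / erg, the unit assumed for the tabulated ECFs

# Subset of Table 2 (pn camera, Thin filter), keyed by energy band number.
PN_THIN_ECF = {1: 9.52, 2: 8.12, 3: 5.87, 4: 1.95, 5: 0.58, 9: 4.56}


def rate_to_flux(rate, ecf):
    """Convert a count rate (cts/s) to a flux (erg cm^-2 s^-1)."""
    return rate / (ecf * ECF_UNIT)
```

A band 1 pn rate of 0.0952 cts/s with the Thin filter therefore corresponds to about 1e-13 erg cm^-2 s^-1, subject to the power-law spectral assumption behind the ECFs.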
3.6. Updated flagging procedures

A significant issue in terms of spurious detections in XMM-Newton data
arises from detections associated with out-of-time (OoT) events. For
sources that do not suffer significantly from pile-up, the background map
used by emldetect includes a component that models the OoT features.
However, for sources where pile-up is significant, the OoT modelling is
inadequate. This can give rise to spurious sources being detected along
OoT features. For the more piled-up objects, the numbers of spurious
detections along OoT features can become large (tens to hundreds).

Another feature arising from bright sources that affects the MOS
instruments is scattered X-rays from the Reflection Grating Arrays (RGA).
These manifest themselves as linear features in MOS images passing
through the bright object, rather similar in appearance to OoT features.
These features are not modelled at all in the background map.

In previous catalogues, spurious detections associated with OoT and RGA
features have simply been masked during manual screening. In the cat9.0
pipeline, for the first time, an attempt has been made to identify the
presence of OoT and RGA features from piled-up sources and to flag
detections that are associated with them.

The SAS task, eootepileupmask, is used for this purpose. This task uses
simple instrument (and mode) -dependent predefined thresholds to test
pixels in an image for pile-up. Where it detects pixels that exceed the
threshold, the column containing that pixel is flagged in a mask map for
the instrument. The task attempts to identify and mask columns and rows
associated with such pixels in OoT and RGA features.

Once the pile-up masks are generated, the SAS task, dpssflag, is used to
set flag 10 of the PN_FLAG, M1_FLAG, M2_FLAG, EP_FLAG columns in the
catalogues for any detection whose centre lies on any masked column or
row.
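
The masking-and-flagging idea can be sketched as follows. This is a simplified illustration that handles columns only and uses a single threshold; it is not the eootepileupmask or dpssflag implementation, and the data layout and function names are hypothetical.

```python
FLAG_OOT_RGA = 10  # flag number quoted in the text


def piled_up_columns(image, threshold):
    """Return the set of column indices holding a pixel above threshold.

    image is a list of rows (lists of counts); threshold stands in for the
    instrument- and mode-dependent limits, whose values are not given here.
    """
    return {x for row in image for x, counts in enumerate(row)
            if counts > threshold}


def flag_detections(detections, masked_columns):
    """Add flag 10 to each detection whose centre column is masked.

    detections: list of dicts with an integer 'x' column index and a
    'flags' set (a hypothetical layout chosen for this example).
    """
    for det in detections:
        if det["x"] in masked_columns:
            det["flags"].add(FLAG_OOT_RGA)
    return detections
```

Masking rows for RGA features would follow the same pattern with the row index in place of the column index.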


4. Source-specific product generation 

4.1. Optimised spectral and time series extraction 

The pipeline processing automatically extracts spectra and time series
(source-specific products, SSPs), from suitable exposures, for detections
that meet certain brightness criteria.

In pre-cat9.0 pipelines, extractions were attempted for any source which
had at least 500 EPIC counts. In such cases, source data were extracted
from a circular aperture of fixed radius (28"), centred on the detection
position, while background data were accumulated from a co-centred
annular region with inner and outer radii of 60" and 180", respectively.
Other sources that lay within or overlapped the background region were
masked during the processing. In most cases this process worked well.
However, in some cases, especially when extracting SSPs from sources
within the small central window of MOS Small-Window mode observations,
the background region could comprise very little usable background, with
the bulk of the region lying in the gap between the central CCD and the
peripheral ones. This resulted in very small (or even zero) areas for
background rate scaling during background subtraction, often leading to
incorrect background subtraction during the analysis of spectra in XSPEC
(Arnaud 1996).

For the bulk reprocessing leading to the 3XMM catalogues, two new
approaches have been adopted and implemented in the cat9.0 pipeline.

1. The extraction of data for the source takes place from an aperture
   whose radius is automatically adjusted to maximise the signal-to-noise
   (S/N) of the source data. This is achieved by a curve-of-growth
   analysis, performed by the SAS task, eregionanalyse. This is
   especially useful for fainter sources where the relative importance of
   the background level is higher.

2. To address the problem of locating an adequately filled background
   region for each source, the centre of a circular background aperture
   of radius, $r_b$ = 168" (comparable area to the previously used
   annulus) is stepped around the source along a circle centred on the
   source position. Up to 40 uniformly spaced azimuthal trials are tested
   along each circle. A suitable background region is found if, after
   masking out other contaminating sources that overlap the background
   circle and allowing for empty regions, a filling factor of at least
   70% usable area remains. If none of the background region trials along
   a given circle yields sufficient residual background area, the
   background region is moved out to a circle of larger radius from the
   object and the azimuthal trials are repeated. The smallest trial
   circle has a radius, $r_c$, of $r_c = r_b$ + 60" so that the inner
   edge of the background region is at least 60" from the source centre -
   for the case of MOS Small-Window mode, the smallest test circle for a
   source in the central CCD is set to a radius that already lies on the
   peripheral CCDs. Other than for the MOS Small-Window cases, a further
   constraint is that, ideally, the background region should lie on the
   same instrument CCD as the source. If no solution is found with at
   least a 70% filling factor, the background trial with the largest
   filling factor is adopted.

   For the vast majority of detections where SSP extraction is attempted,
   this process obtains a solution in the first radial step, with a
   strong bias to early azimuthal steps, i.e. in most cases an acceptable
   solution is found very rapidly. For detections in the MOS instruments,
   about 1.7% lie in the central window in Small-Window mode and have a
   background region located on the peripheral CCDs. Importantly, in
   contrast to earlier pipelines, this process always yields a usable
   background spectrum for objects in the central window of MOS
   Small-Window mode observations.

   This approach to locating the background region was adopted primarily
   to provide a single algorithm that works for all sources, including
   those located in the MOS small window, when used. However, a drawback
   relative to the use of the original annular background region arises
   where sources are positioned on a notably ramped or other spatially
   variable background (e.g. in the wings of cluster emission), where the
   background that is subtracted can vary, depending on which side of the
   source the background region is located.
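
The background-region search in item 2 can be sketched as below, taking the background aperture radius as 168" and the smallest trial circle as that radius plus 60". The usable-area calculation is abstracted into a caller-supplied function, and the radial step size and maximum number of circles are assumptions, since the text does not quote them. The same-CCD preference and the MOS Small-Window special case are omitted.

```python
import math

R_B = 168.0        # background aperture radius in arcsec (from the text)
MIN_GAP = 60.0     # minimum gap between source centre and aperture edge
N_AZIMUTH = 40     # maximum azimuthal trials per circle (from the text)
MIN_FILL = 0.70    # required usable-area fraction (from the text)


def find_background_region(src_x, src_y, filling_factor,
                           radial_step=30.0, max_circles=10):
    """Return (x, y, fill) for the chosen background aperture centre.

    filling_factor(x, y) is a caller-supplied function giving the usable
    fraction of an aperture of radius R_B centred at (x, y). radial_step
    and max_circles are assumptions; the text does not quote them.
    """
    best = None
    for k in range(max_circles):
        r_c = R_B + MIN_GAP + k * radial_step
        for n in range(N_AZIMUTH):
            phi = 2.0 * math.pi * n / N_AZIMUTH
            x = src_x + r_c * math.cos(phi)
            y = src_y + r_c * math.sin(phi)
            fill = filling_factor(x, y)
            if fill >= MIN_FILL:
                return x, y, fill          # first acceptable trial wins
            if best is None or fill > best[2]:
                best = (x, y, fill)
    return best                            # fallback: largest filling factor
```

Returning the first acceptable trial mirrors the statement that most solutions are found in the first radial step at an early azimuth.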


In addition, the cat9.0 pipeline permits extraction of SSPs for fainter
sources. Extraction is considered for any detection with at least 100
EPIC source counts (EP_8_CTS). Where this condition is met, a spectrum
from the source aperture (i.e. source plus background) is extracted. If
the number of counts from spectrum channels not flagged as 'bad' (in the
sense adopted by XSPEC) is > 100, a spectrum and time series are
extracted for the exposure. The initial filter on EPIC counts is used to
limit the processing time as, for dense fields, the above background
location process can be slow.
4.2. Attitude GTI filtering

Occasionally, the spacecraft can be settling on to, or begin moving away
from, the intended pointing direction within the nominal observing window
of a pointed XMM-Newton observation, resulting in notable attitude drift
at the start or end of an exposure. Image data are extracted from events
only within 'Good Time Intervals' (GTIs) when the pointing direction is
within 3' of the nominal pointing position for the observation. However,
in pre-cat9.0 pipelines, spectra and time series were extracted without
applying such attitude GTI filtering. Occasionally, this resulted in a
source location being outside or at the edge of the field of view when
some events were being collected, leading to incorrect transitions in the
time series. In some cases, these transitions gave rise to the erroneous
detection of variability in subsequent time series processing. In the
cat9.0 pipeline, attitude GTI filtering is applied during the extraction
of spectra and time series.
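
A minimal sketch of attitude-based GTI filtering is given below. It assumes the attitude history is available as sampled times with the corresponding pointing offset (in arcmin) from the nominal position; this layout and the function names are illustrative assumptions, not the pipeline's interface.

```python
def attitude_gtis(times, offsets_arcmin, max_offset=3.0):
    """Build (start, stop) intervals from consecutive in-tolerance samples."""
    gtis = []
    start = None
    last_good = None
    for t, offset in zip(times, offsets_arcmin):
        if offset < max_offset:
            if start is None:
                start = t          # open a new interval
            last_good = t
        elif start is not None:
            gtis.append((start, last_good))  # close the current interval
            start = None
    if start is not None:
        gtis.append((start, last_good))
    return gtis


def filter_events(event_times, gtis):
    """Keep only events whose time falls inside one of the GTIs."""
    return [t for t in event_times
            if any(lo <= t <= hi for lo, hi in gtis)]
```

Events collected while the spacecraft was still settling, or had started to move away, fall outside the intervals and are dropped before spectra and time series are built.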


4.3. Variability characterisation 

As with pre-cat9.0 pipeline processing, the pipeline processing for the
3XMM catalogues subjects each extracted exposure-level source time series
to a test for variability. This test is a simple $\chi^2$ analysis for
the null hypothesis of a constant source count rate (Watson et al. 2009 -
see their section 5.2). Sources with a probability $< 10^{-5}$ of being
constant have been flagged as being variable in previous XMM-Newton X-ray
source catalogues and this same approach is adopted for 3XMM.

In addition, for 3XMM, we have attempted to characterise the scale of the
variability through the fractional variability amplitude, $F_{\rm var}$
(provided via the PN_FVAR, M1_FVAR and M2_FVAR columns and associated
FVARERR columns), which is simply the square root of the excess variance,
after normalisation by the mean count rate, $\langle R \rangle$, i.e.

$$ F_{\rm var} = \sqrt{\frac{S^2 - \langle\sigma_{\rm err}^2\rangle}{\langle R \rangle^2}} \qquad (4) $$

(e.g. Edelson et al. 2002; Nandra et al. 1997 and references therein),
where $S^2$ is the observed variance of the time series with N bins, i.e.

$$ S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (R_i - \langle R \rangle)^2 $$

in which $R_i$ is the count rate in time bin i. For the calculation of
the excess variance, $(S^2 - \langle\sigma_{\rm err}^2\rangle)$, which
measures the level of observed variance above that expected from pure
data measurement noise, the noise component,
$\langle\sigma_{\rm err}^2\rangle$, is computed as the mean of the
squares of the individual statistical errors, $\sigma_i$, on the count
rates of each bin, i, in the time series.

The uncertainty, $\Delta(F_{\rm var})$, on $F_{\rm var}$, is calculated
following equation B2 in appendix B of Vaughan et al. (2003), i.e.

$$ \Delta(F_{\rm var}) = \left[ \left( \sqrt{\frac{1}{2N}}\, \frac{\langle\sigma_{\rm err}^2\rangle}{\langle R \rangle^2 F_{\rm var}} \right)^2 + \left( \sqrt{\frac{\langle\sigma_{\rm err}^2\rangle}{N}}\, \frac{1}{\langle R \rangle} \right)^2 \right]^{1/2} $$

This takes account of the statistical errors on the time bins but not
scatter intrinsic to the underlying variability process.
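
The fractional variability amplitude and its uncertainty can be computed as in the sketch below, which uses the sample variance with an N-1 denominator and the Vaughan et al. (2003) error expression. Returning None when the excess variance is not positive is a convention chosen for this example; the text does not state how the pipeline treats that case.

```python
import math


def fractional_variability(rates, errors):
    """Return (F_var, error on F_var), or None if excess variance <= 0.

    rates: count rate per time bin; errors: 1-sigma error per bin.
    """
    n = len(rates)
    mean_rate = sum(rates) / n
    s2 = sum((r - mean_rate) ** 2 for r in rates) / (n - 1)  # observed variance
    noise = sum(e * e for e in errors) / n                   # mean squared error
    excess = s2 - noise
    if excess <= 0.0:
        return None
    f_var = math.sqrt(excess) / mean_rate
    term1 = math.sqrt(1.0 / (2.0 * n)) * noise / (mean_rate ** 2 * f_var)
    term2 = math.sqrt(noise / n) / mean_rate
    return f_var, math.hypot(term1, term2)
```

For a series such as rates [1, 3, 1, 3] with errors of 0.1 per bin, the observed variance dominates the noise term and F_var is well defined, whereas a constant series returns None.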


5. Screening

As for previous XMM-Newton X-ray source catalogues (Watson et al. 2009 -
see section 7), every XMM-Newton observation in the 3XMM catalogues has
been visually inspected with the purpose of identifying problematic areas
where source detection or source characterisation is potentially suspect.
The manual screening process generates mask files that define the
problematic regions. These may be confined regions around individual
suspect detections or larger areas enclosing multiple affected
detections, up to the full area of the field where serious problems
exist. Detections in such regions are subsequently assigned a manual flag
(flag 11) in the flag columns (PN_FLAG, M1_FLAG, M2_FLAG, EP_FLAG). It
should be noted that a detection with flag 11 set to (T)rue does not
necessarily indicate that the detection is considered to be spurious.

One significant change to the screening approach adopted for 3XMM relates
to the flagging of bright sources and detections within a halo of suspect
detections around the bright source. Previously, all detections in the
halo region, including the primary detection of the bright source itself
(where discernible), had flag 11 set to True (manual flag) but the
primary detection of the bright object itself also had flag 12 set. The
meaning of flag 12 there was to signify that the bright object detection
was not considered suspect. The use of flag 12 in this 'negative'
context, compared to the other flags, was considered to be potentially
confusing. For this reason, for the 3XMM catalogues, we have dropped the
use of flag 12 and simply ensured that, where the bright object detection
is clearly identified, it is un-flagged (i.e. neither flag 11 nor flag 12
is set). Effectively, flag 12 is not used in 3XMM. It should be noted
that bright sources that suffer significant pile-up are not flagged in
any way in 3XMM (or in previous XMM-Newton X-ray source catalogues).

The masked area of each image is an indicator of the quality of the field
as a whole. Large masked areas are typically associated with diffuse
extended emission, very bright sources whose wings extend across much of
the image, or problems such as arcs arising from single-reflected X-rays
from bright sources just outside the field of view. The fraction of the
field of view that is masked is characterised by the observation class
(OBS_CLASS) parameter. The distribution of the six observation classes in
the 3XMM catalogues has changed with respect to 2XMMi-DR3 (see Table 3).
The dominant change is in the split of fields assigned observation
classes 0 and 1, with more fields that were deemed completely clean in
2XMMi-DR3 having very small areas (generally single detections) marked as
suspect in the 3XMM catalogues. Often these are features that were
considered, potentially, to be unrecognised bright pixels, e.g.
detections dominated by a single bright pixel in one instrument with no
similar feature in the other instruments. It should be emphasised,
however, that the manual screening process is unavoidably subjective.

Table 3. 3XMM observation classification (OBS_CLASS) (first column),
percentage of the field considered problematic (second column) and the
percentage of fields that fall within each class for 2XMMi-DR3 and
3XMM-DR5 (third and fourth columns respectively).

OBS_CLASS   masked fraction           2XMMi-DR3   3XMM-DR5
0           bad area = 0%             38%         27%
1           0% < bad area < 0.1%      12%         22%
2           0.1% < bad area < 1%      10%         12%
3           1% < bad area < 10%       25%         24%
4           10% < bad area < 100%     10%         11%
5           bad area = 100%           5%          4%
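
The mapping from masked fraction to observation class can be written as a short function. The table, as extracted here, gives strict inequalities on both sides of each range, so the treatment of values exactly at 0.1%, 1% and 10% is not specified; assigning them to the higher class is an assumption made for this sketch, and the function name is hypothetical.

```python
def obs_class(masked_percent):
    """Map the masked percentage of a field (0-100) to OBS_CLASS 0-5."""
    if masked_percent <= 0.0:
        return 0
    if masked_percent >= 100.0:
        return 5
    # Upper limits (in percent) of classes 1 to 3; boundary values go up.
    for cls, upper in ((1, 0.1), (2, 1.0), (3, 10.0)):
        if masked_percent < upper:
            return cls
    return 4
```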
6. Catalogue construction: unique sources

The 3XMM detection catalogues collate all individual detections from the
accepted observations. Nevertheless, since some fields have at least
partial overlaps with others and some targets have been observed
repeatedly with the target near the centre of the field of view, many
X-ray sources on the sky were detected more than once (up to 48 times in
the most extreme case). Individual detections have been assigned to
unique sources on the sky (i.e. a common unique source identifier, SRCID,
has been allocated to detections that are considered to be associated
with the same unique source) using the procedure outlined here. The
process used in constructing the 3XMM catalogues has changed from that
used for the 2XMM series of catalogues (Watson et al. 2009 - see their
section 8.1).

The matching process is divided into two stages. The first stage finds,
for each detection, all other matching detections within 15" of it, from
other fields (i.e. excluding detections from within the same observation,
which, by definition, are regarded as arising from distinct sources) and
computes a Bayesian match probability for each pair as (Budavári & Szalay
2008)

$$ p_{\rm match} = \left[ 1 + \frac{1 - p_0}{B \cdot p_0} \right]^{-1} \qquad (5) $$

Here, B, the Bayes factor, is given by

$$ B = \frac{2}{\sigma_1^2 + \sigma_2^2} \exp\left[ -\frac{\psi^2}{2(\sigma_1^2 + \sigma_2^2)} \right] \qquad (6) $$

where $\sigma_1$ and $\sigma_2$ are the positional error radii of each
detection in the pair (in radians) and $\psi$ is the angular separation
between them, in radians. $p_0 = N_*/(N_1 N_2)$, where $N_1$ and $N_2$
are the numbers of objects in the sky based on the surface densities in
the two fields and $N_*$ is the number of objects common between them.
Each of these N values is derived from the numbers of detections in the
two fields and is then scaled to the whole sky. The value of $N_*$ is not
known, a priori, and in general can be obtained iteratively by running
the matching algorithm. However, here we are matching observations of the
same field taken with the same telescope at two different epochs so that
in most cases, objects will be common. Of course this assumption is
affected by the fact that the two observations being considered may
involve different exposure times, different instruments, filters and
modes used and different boresight positions (with sources within their
fields of view being subject to different vignetting factors). To gauge
the impact of such effects in determining $N_*$, trials using an
iterative scheme were run, which indicated that taking
$N_* = 0.9\min(N_1, N_2)$ provides a good estimate of $N_*$ without the
need for iteration. Finally, with all pairs identified and probabilities
assigned, pairs with $p_{\rm match} < 0.5$ were discarded.
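
The pairwise match probability can be sketched in Python as below, using the Budavári & Szalay (2008) form of the Bayes factor, B = 2/(s1^2 + s2^2) exp[-psi^2 / (2 (s1^2 + s2^2))], together with the prior built from N* = 0.9 min(N1, N2). Function names are illustrative, and the all-sky source counts used in any call are inputs supplied by the user rather than values from the text.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def bayes_factor(sigma1, sigma2, psi):
    """Bayes factor for two detections; all arguments in radians."""
    s2 = sigma1 ** 2 + sigma2 ** 2
    return 2.0 / s2 * math.exp(-psi ** 2 / (2.0 * s2))


def match_probability(sigma1, sigma2, psi, n1, n2):
    """Posterior match probability for a pair of detections.

    n1 and n2 are the all-sky-scaled object counts for the two fields.
    """
    n_star = 0.9 * min(n1, n2)          # estimate of common objects
    p0 = n_star / (n1 * n2)             # prior match probability
    b = bayes_factor(sigma1, sigma2, psi)
    return 1.0 / (1.0 + (1.0 - p0) / (b * p0))
```

With 1" error radii and illustrative counts of 1e7 objects per field, two coincident detections give a probability close to 1, while a 15" separation gives a value far below the 0.5 cut.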

In the second, clustering stage, a figure-of-merit is computed for each
detection, referred to here as the goodness-of-clustering (GoC), which is
the number of matches the detection has with other detections, normalised
by the area of its error circle (whose radius is given by POSERR, see
Appendix O). This GoC measure prioritises detections that lie towards the
centre of a group of detections, and are thus likely to be most reliably
associated with a given unique source. The list of all detections is then
sorted by this GoC value. The algorithm works down the GoC-sorted list
and for each detection, the other detections it forms pairs with are
sorted by $p_{\rm match}$. Then, descending this list of pairs, for each
one there are four possibilities for assigning the unique source
identifiers: i) if both detections have previously been allocated to a
unique source and already have the same SRCID, nothing is done, ii) if
neither has a SRCID, both are allocated the same, new SRCID, iii) if only
one of them has already been assigned a SRCID, the other is allocated the
same SRCID, iv) where both detections in the pair have allocated but
different SRCIDs, this represents an ambiguous case - for these, the
existing SRCIDs are left unchanged but a confusion flag is set for both
detections.
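
The clustering stage can be illustrated with the sketch below. It assumes the GoC-sorted list is processed from the highest value downwards, that GoC is the match count divided by the area of a circle of radius POSERR, and that detections with no surviving pairs each receive their own SRCID; these points, along with the data layout and function name, are assumptions made for the example.

```python
import math


def assign_srcids(poserr, pairs):
    """Group detections into unique sources.

    poserr: {detection_id: error radius}; pairs: {(id_a, id_b): p_match}
    for pairs that survived the probability cut. Returns (srcid, confused).
    """
    partners = {d: [] for d in poserr}
    for (a, b), p in pairs.items():
        partners[a].append((p, b))
        partners[b].append((p, a))
    goc = {d: len(partners[d]) / (math.pi * poserr[d] ** 2) for d in poserr}

    srcid, confused, next_id = {}, set(), 1
    for det in sorted(poserr, key=lambda d: goc[d], reverse=True):
        for _, other in sorted(partners[det], key=lambda t: t[0], reverse=True):
            sa, sb = srcid.get(det), srcid.get(other)
            if sa is None and sb is None:      # case ii: new unique source
                srcid[det] = srcid[other] = next_id
                next_id += 1
            elif sa is None:                   # case iii
                srcid[det] = sb
            elif sb is None:                   # case iii
                srcid[other] = sa
            elif sa != sb:                     # case iv: ambiguous
                confused.update((det, other))
            # case i (same SRCID already): nothing to do
    for det in poserr:                         # isolated detections (assumed)
        if det not in srcid:
            srcid[det] = next_id
            next_id += 1
    return srcid, confused
```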

This approach is reliable in matching detections into unique sources in
the large majority of cases. Nevertheless, there are situations where the
process can fail or yield ambiguous results. Examples typically arise in
complex regions, such as where spurious sources, associated with diffuse
X-ray emission or bright sources, are detected and, by chance, are
spatially close to the positions of other detections (real or spurious)
in other observations of the same sky region. Often, in such cases, the
detections involved will have manual quality flags set (Watson et al.
2009 - see their section 7.5, and also section 5 above).

Other scenarios that can produce similar problems include i) poorly
centroided sources, e.g. those suffering from pile-up or optical loading,
ii) cases where frame rectification (see section 3.4) fails and
positional uncertainties are larger than the default frame-shift error of
1.5" that is adopted for un-rectified fields, iii) sources associated
with artefacts such as out-of-time event features arising from bright
objects elsewhere in the particular CCD, or residual bright pixels and
iv) cases where multiple detections of sources that show notable proper
motion (which is not accounted for in pipeline processing) end up being
grouped into more than one unique source along the proper motion vector.
Overall, in 3XMM-DR5, this matching process has associated 239505
detections with 70453 unique sources that comprise more than one
detection.

7. External catalogue cross-correlation 

The XMM-Newton pipeline includes a specific module, the Astronomical
Catalogue Data Subsystem (ACDS), running at the Observatoire de
Strasbourg. This module lists possible multi-wavelength identifications
and generates optical finding charts for all EPIC detections. Information
on the astrophysical content of the EPIC field of view is also provided
by the ACDS.

When possible, finding charts are built using g-, r- and i-band images
extracted from the SDSS image server and assembled in false colours.
Outside of the SDSS footprint, images are extracted from the Aladin image
server. The list of archival astronomical catalogues used during the 3XMM
processing includes updated versions of those used for 2XMM and adds some
of the most relevant catalogues published since 2007. A total of 228
catalogues were queried including Simbad and NED. Note that NED entries
already included in ACDS catalogues (e.g. SDSS) were discarded.

Among the most important additions are:

i) the Chandra source catalogue version 1.1 (Evans et al. 2010). This
release contains point and compact source data extracted from HRC images
as well as available ACIS data public at the end of 2009. ACDS accesses
the Chandra source catalogue using the VO cone search protocol,
ii) the Chandra ACIS survey in 383 nearby galaxies (Liu 2011),
iii) the SDSS Photometric Catalog, Release 8 (Aihara et al. 2011),
iv) the MaxBCG galaxy clusters catalogue from SDSS (Koester et al. 2007),
v) the 2XMMi/SDSS DR7 cross-correlation (Pineau et al. 2011),
vi) the 3rd release of the RAVE catalogue (Siebert et al. 2011),
vii) the IPHAS Hα emission line source catalogue (Witham et al. 2008),
viii) the WISE All-Sky data Release (Cutri et al. 2012),
ix) the AKARI mid-IR all-sky survey (Ishihara et al. 2010) and version
1.0 of the all-sky survey bright source catalogue (Yamamura et al. 2010),
x) the Spitzer IRAC survey of the galactic center (Ramírez et al. 2008),
xi) the GLIMPSE Source Catalogue (I + II + 3D) (Churchwell et al. 2009),
xii) the IRAC-24micron optical source catalogue (Surace et al. 2004),
and xiii) the Planck Early Release Compact Source Catalogue (Planck
Collaboration et al. 2011).

Table 4. Cross-matching statistics between 3XMM sources and other
catalogues.

Catalogue          Detections    Catalogue       Detections
Chandra src cat.   63,676        Chandra gal.    9,908
SDSS8              129,252       RAVE            219
USNO-B1.0          229,730       IPHAS           38
WISE               454,957       AKARI           5,598
2MASS              36,830        GLIMPSE         35,572
Simbad             204,657       Planck ERCSC    43,136
NED                296,914

Table 4 lists, for a selection of archival catalogues, the number
of EPIC detections having at least one catalogue entry in the
99.73% (3 Gaussian $\sigma$) confidence region.

The cross-matching method used for 3XMM is identical to
that applied in the former XMM catalogues. Briefly,
ACDS retains all archival catalogue entries located within the
99.73% confidence region around the position of the EPIC detection.
The corresponding error ellipse takes into account systematic
and statistical uncertainties on the positions of both EPIC
and archival catalogue entries. The 3XMM implementation of
the ACDS assumes that the error distribution of EPIC positions
is represented by the 2-D Gaussian distribution

    $$f(\delta_{RA}, \delta_{DEC}) = \frac{1}{2\pi\sigma_{RA}\sigma_{DEC}} \exp\left\{-\frac{1}{2}\left(\frac{\delta_{RA}^2}{\sigma_{RA}^2} + \frac{\delta_{DEC}^2}{\sigma_{DEC}^2}\right)\right\}$$

with

    $$\sigma_{RA} = \sigma_{DEC} = \sqrt{RADEC\_ERR^2/2 + SYSERRCC^2/2}$$

thereby correcting for the overestimated error value used during
the 2XMM processing.
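
As a rough illustration of the error model above, the following Python
sketch computes the per-axis Gaussian width from the catalogue columns
RADEC_ERR and SYSERRCC, and the radius that encloses 99.73% of a
circular 2-D Gaussian. The function names are hypothetical and the
input values are purely illustrative. Treating the confidence region
as a circle of radius $\sigma\sqrt{-2\ln(1-P)}$ is an assumption that
holds only for the symmetric case $\sigma_{RA} = \sigma_{DEC}$; it is
not the ACDS implementation, which also folds in the errors of the
archival catalogue entries.

```python
import math

def sigma_per_axis(radec_err, syserrcc):
    """Per-axis 1-sigma width (arcsec): sqrt(RADEC_ERR^2/2 + SYSERRCC^2/2)."""
    return math.sqrt(radec_err**2 / 2.0 + syserrcc**2 / 2.0)

def confidence_radius(sigma, prob=0.9973):
    """Radius enclosing `prob` of a circular 2-D Gaussian with per-axis width sigma."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - prob))

# Illustrative values only (arcsec), not taken from the catalogue.
sigma = sigma_per_axis(radec_err=1.0, syserrcc=0.5)
radius = confidence_radius(sigma)
print(f"sigma = {sigma:.3f} arcsec, 99.73% radius = {radius:.3f} arcsec")
```

Under these assumptions the 99.73% radius is about 3.44 times the
per-axis width, rather than 3 times as in the 1-D case.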

ACDS identifications are not part of the 3XMM catalogue FITS file
but are made available to the community through the XSA and through
the XCAT-DB (http://xcatdb.unistra.fr), a dedicated interface
developed by the SSC in Strasbourg (Motch et al. 2009; Michel et al.
in press). The XCAT-DB also gives access to the entire 3XMM
catalogue and to some of the associated pipeline products such
as time series and spectra. Quick look facilities and advanced
selection and extraction methods are complemented by simple
X-ray spectral fitting tools. In the near future, the database will
be enriched by the multi-wavelength statistical identifications
and associated spectral energy distributions computed within the
ARCHES project (Motch & Arches Consortium 2014). Spectral fitting
results from the XMMFITCAT database (Corral et al. 2014) are also
partially available.


8. Known problems in the catalogue 

A number of small but significant issues have been identified that
affect the data in the 3XMM catalogues. Two of these affect both
the 3XMM-DR4 and 3XMM-DR5 catalogues. The other issues
affect only 3XMM-DR4 and are described in Appendix A.

1. The optimised flare filtering process (see section 3.2.3)
returns a background rate threshold for screening out background
flares for each exposure during processing. While this process
generally works well, when the background level is persistently
high throughout the observation, the optimised cut level, while
formally valid, can still result in image data with a high
background level. In principle, such cases could be identified by
testing the cut threshold against a pre-determined benchmark for
each instrument. However, this is complicated by the fact that,
since the analysis is now measured in-band, apparently high
background levels can also arise in fields containing large
extended sources. To simplify the process of identifying affected
fields, a visual check was performed during manual screening.
Fields where high background levels were suspected were noted, and
detections from those fields are flagged in the 3XMM catalogues via
the HIGH_BACKGROUND column. This screening approach has been
somewhat conservative and subjective. A total of 21779 (20625)
detections from 568 (552) XMM-Newton observations are flagged as
such in 3XMM-DR5; numbers in parentheses are for 3XMM-DR4.

2. A further issue recognised in the 3XMM catalogue is
that of detections in the previous 2XMMi-DR3 catalogue
that are not present in the 3XMM catalogue. There are
4921 XMM-Newton observations that are common between the
2XMMi-DR3 and 3XMM-DR5 catalogues, resulting in
349444 detections in 2XMMi-DR3 and 359505 detections
in 3XMM-DR5. Of these, there are 274564 point-like
detections with SUM_FLAG < 1 in 2XMMi-DR3 and 283436 in
3XMM-DR5. However, amongst these observations, there
are ~54000 detections that appear in 2XMMi-DR3 that are
not matched with a detection in the same observation in the
3XMM-DR5 catalogue within 10". About 25700 of these
were classified as the cleanest (SUM_FLAG < 1), point-like
sources in 2XMMi-DR3; these are referred to as missing
3XMM detections in what follows. It should be noted that,
in reverse, there are ~64000 detections in the 3XMM
catalogues that are in common observations but not matched
with a detection in 2XMMi-DR3 within 10", approximately
33600 of which are classed as being clean and point-like.
The details explaining these 'missing sources' are given in
Appendix D, but the main reasons for the source discrepancies
between the two catalogues are two of the major
improvements to the 3XMM catalogue with respect to the
2XMM catalogue, namely the new empirical PSF, described
in Sec. 3.3, and the optimised flare filtering (see Sec. 3.2.3).
Other origins, such as MIP events which were present in
2XMMi-DR3 but not recognised as such, and mostly removed
in 3XMM-DR5, also contribute to the missing sources, but to
a lesser extent. In general though, the improvements
to the pipeline that was used to create the 3XMM-DR5 data
introduce (generally small) changes in the likelihood values
(of mostly real sources). As such, the imposed threshold cut
at L = 6 for inclusion in the catalogue results in a fraction of
sources that had L < 6 in 2XMMi-DR3 having a likelihood a
little above it in the 3XMM-DR5 processing, and vice versa,
leading to losses and gains between the catalogues. Overall,
the processing for 3XMM-DR5 is shown to be an improvement
over the 2XMMi-DR3 procedure (see Appendix D),
resulting in more sources.

Fig. 4. Numbers of 3XMM-DR5 unique sources comprising given
numbers of repeat detections.

9. Catalogue characterisation 

9.1. General properties 

The 3XMM-DR5 catalogue contains 565962 (531261) detections,
associated with 396910 (372728) unique sources on the
sky, extracted from 7781 (7427) public XMM-Newton observations;
numbers in parentheses are for 3XMM-DR4. Amongst
these, 70453 (66728) unique sources have multiple detections,
the maximum number of repeat detections being 48 (44 for
3XMM-DR4), see Fig. 4. In 3XMM-DR5, 55640 X-ray detections
are identified as extended objects, i.e. with a core radius
parameter, $r_{core}$, as defined in section 4.4.4 of Watson et al.
(2009), > 6", with 52493 of these having $r_{core}$ < 80". Overall
properties in terms of completeness and false detection rates
are not expected to differ significantly from those described in
Watson et al. (2009).

9.2. Astrometric properties 

As outlined in section 3.4, several changes have been made to the
processing that affect the astrometry of the 3XMM catalogues
relative to previous XMM-Newton X-ray source catalogues. To
assess the quality of the current astrometry, we have broadly
followed the approach outlined in Watson et al. (2009). Detections
in the 3XMM-DR5 catalogue were cross-correlated against
the SDSS DR12Q quasar catalogue (Paris et al. in prep.), which
contains ~297300 objects spectroscopically classified as quasars;
positions and errors were taken from the SDSS DR9 catalogue.
X-ray detections with an SDSS quasar counterpart within
15" were extracted. Point-like 3XMM-DR5 detections were selected
with summary flag 0 (see the Appendix), from successfully
catcorr-corrected fields, with EPIC detection log-likelihood > 8
and at off-axis angles < 13'. The SDSS quasars were required
to have warning flag 0, morphology 0 (point-like) and r' and g'
magnitudes both < 22.0. This yielded a total of 6614 3XMM-QSO
pairs. In the 13 cases where more than one optical quasar
match was found within 15", the nearest match was retained.
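
The pairing step described above (keep the nearest quasar within 15")
can be sketched as follows. This is a minimal illustration, not the
procedure actually used: the function names are hypothetical, the
coordinates are made up, and a brute-force loop with the haversine
formula is assumed in place of whatever indexing scheme the real
cross-correlation employed.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0

def nearest_match(xray, quasars, radius_arcsec=15.0):
    """Return (separation, quasar) for the closest quasar within the radius, else None."""
    best = None
    for q in quasars:
        sep = angular_sep_arcsec(xray[0], xray[1], q[0], q[1])
        if sep <= radius_arcsec and (best is None or sep < best[0]):
            best = (sep, q)
    return best

# Made-up (RA, Dec) positions in degrees: two candidates within 15", one far away.
xray = (150.0, 2.0)
quasars = [(150.0, 2.0 + 10.0 / 3600.0), (150.0, 2.0 + 3.0 / 3600.0), (151.0, 2.0)]
match = nearest_match(xray, quasars)
print(match)
```

Here the second candidate, 3" away, is retained and the 10" candidate
is discarded, mirroring the nearest-match rule in the text.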

The cross-matching used the catcorr-corrected RA and DEC
X-ray detection coordinates. The measured separation, $\Delta r$,
and the overall 1-dimensional XMM position error, $\sigma_{1D}$
($= \sigma_{pos}/\sqrt{2}$), were recorded. Here $\sigma_{pos}$ is the
radial positional error, POSERR, in the catalogues, which is the
quadrature sum of the XMM positional uncertainties resolved in the
RA and DEC directions. As noted by Watson et al. (2009), if the
offset between the X-ray source and its SDSS quasar counterpart,
$\Delta r$, is normalised by the total position error, $\sigma_{tot}$,
i.e. $x = \Delta r/\sigma_{tot}$, the distribution of these
error-normalised offsets is expected to follow the Rayleigh
distribution,

    $$N(x)\,dx \propto x\,e^{-x^2/2}\,dx \qquad (7)$$

Errors on the SDSS quasar positions were included in $\sigma_{tot}$,
though they are generally < 0.1", much smaller than the vast
majority of $\sigma_{1D}$ values in 3XMM-DR5. The SDSS position
errors were circularised to a single value, $\sigma_{QSO}$, computed
from $\sigma_{maj}$ and $\sigma_{min}$, the errors in the major and
minor axis directions of the SDSS position error ellipse. These were
then combined in quadrature with the XMM position error to obtain
$\sigma_{tot} = (\sigma_{1D}^2 + \sigma_{QSO}^2)^{1/2}$. No systematic
error was included for the QSO position error.

In Fig. 5 we show the distribution of x values for the selected
XMM-QSO pairs as the red histogram, with the expected
Rayleigh distribution overlaid in black. While there is broad
overall agreement between the data and model, it is clear that
there is a deficit of sources for 0.8 < x < 2 and an excess for
x > 2.5. A total of 739 XMM-QSO pairs lie at 2.5 < x < 6 while
the model predicts 291, the excess of 448 representing 6.8% of
the total in the histogram.
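
The quoted model prediction can be checked with a few lines of
Python. The sketch below integrates the unit-width Rayleigh
distribution analytically; it assumes the model is normalised to the
full set of 6614 pairs, which is an inference from the numbers in the
text rather than something stated explicitly. The function name is
hypothetical.

```python
import math

def rayleigh_fraction(x_lo, x_hi):
    """Fraction of a unit-width Rayleigh distribution, x*exp(-x^2/2), in [x_lo, x_hi]."""
    return math.exp(-x_lo**2 / 2.0) - math.exp(-x_hi**2 / 2.0)

n_pairs = 6614    # XMM-QSO pairs quoted in the text
observed = 739    # pairs with 2.5 < x < 6 quoted in the text

expected = n_pairs * rayleigh_fraction(2.5, 6.0)
excess = observed - round(expected)
print(round(expected), excess, f"{100.0 * excess / n_pairs:.1f}%")
```

Under that normalisation assumption, this reproduces the 291 expected
pairs, the excess of 448 and the 6.8% fraction given above.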

To investigate the small discrepancies between the distribution
of x values for the selected XMM-QSO pairs and the
Rayleigh distribution, we carried out a number of tests detailed
in the Appendix. The main results of these tests indicate that the
excess of 3XMM-DR5 detections with error-normalised offsets
from their SDSS quasar counterparts > 3.5 appears to have a
modest dependence on the off-axis location of the detection in
the XMM field of view. A small fraction of detections at higher
off-axis angles have either incorrect positions or underestimated
errors, while sources near the centre may have slightly
overestimated errors. Further, given that the sources at higher
off-axis angles in EPIC images have rather elongated PSF profiles,
the assumption of a circularly symmetric positional uncertainty
distribution is probably not adequately representative of such
sources, which may be contributing to the observed discrepancies.

Fig. 5. Distribution of position-error-normalised offsets between
3XMM-DR5 X-ray sources and SDSS quasar counterparts (red
histogram). The expected Rayleigh distribution is overlaid (black). The
XMM position errors are as provided in the 3XMM catalogues (i.e.
unadjusted, with no scaling or systematic included). Also shown are
similar histograms for data from EPIC off-axis angles, $\theta$, in the
ranges $\theta$ < 5' (blue), 5' < $\theta$ < 10' (green) and
10' < $\theta$ < 15' (grey). The data are normalised to unit area.

Subsequent analysis investigated whether the discrepancies
could be reduced by making phenomenological adjustments to
the XMM position errors. In this analysis, the filtering applied to
XMM and SDSS sources was similar to that outlined above but
only matches within 5" were used and no magnitude limits were
imposed on the SDSS objects, resulting in 6858 pairs. A
two-parameter adjustment was considered in which the XMM position
errors were scaled by a constant, a, and a systematic error, b, was
added in quadrature (i.e. $\sigma'_{1D} = (a^2\sigma_{1D}^2 + b^2)^{1/2}$).
One-parameter adjustments, where only the systematic was added
(i.e. where a is set to 1), were also tested. The error-normalised
XMM-quasar separations were recomputed as
$x' = \Delta r/\sigma'_{tot}$, where $\sigma'_{tot}$ now
combines $\sigma'_{1D}$ with $\sigma_{QSO}$ in quadrature. Using this
prescription, the data were fit to the Rayleigh function to obtain
the best-fit values for a and/or b, using a maximum likelihood
approach. The results are shown in figure 6. While these
parameterisations of the XMM position errors did improve the fit,
particularly bringing the data in the tail closer to the expected
Rayleigh curve, the fit remains poor overall, driving the peak of
the data to x ≈ 0.7 (it should peak at 1.0) and introducing a
notable excess at x < 1. Although these two forms of adjustment to
the XMM position errors yield statistically unacceptable fits to the
Rayleigh function, they do improve agreement in the tail (i.e. for a
given XMM-counterpart pair, x' < x), and so reduce the chance that
real matches of 3XMM sources with counterparts from other
catalogues (or observations) will be erroneously excluded as
candidate counterparts. As such, although the position error column
values in the 3XMM catalogues are not adjusted, we provide the
values of a (= 1.12) and b (= 0.27") for the two-parameter fit so
that users can apply the above adjustments to the XMM position
errors if they wish. The one-parameter case best fit yields
b = 0.37.
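
For users who wish to apply the adjustment, a minimal Python sketch
is given below. It assumes the adjustment acts on the 1-dimensional
error $\sigma_{1D}$ = POSERR/$\sqrt{2}$ and that b is in arcsec; the
function names and the example separation are hypothetical and are
not part of any catalogue tool.

```python
import math

A_SCALE = 1.12   # scaling factor a from the two-parameter fit
B_SYS = 0.27     # systematic b (arcsec) from the two-parameter fit

def adjusted_sigma_1d(sigma_1d, a=A_SCALE, b=B_SYS):
    """Adjusted 1-D error: sqrt(a^2 * sigma_1d^2 + b^2)."""
    return math.sqrt((a * sigma_1d) ** 2 + b ** 2)

def normalised_offset(delta_r, sigma_1d, sigma_qso=0.0):
    """x = delta_r / sigma_tot, with sigma_tot the quadrature sum of both errors."""
    return delta_r / math.hypot(sigma_1d, sigma_qso)

# Illustrative pair: POSERR = 1 arcsec, separation 3 arcsec, QSO error 0.1 arcsec.
sigma_1d = 1.0 / math.sqrt(2.0)
x = normalised_offset(3.0, sigma_1d, 0.1)
x_adj = normalised_offset(3.0, adjusted_sigma_1d(sigma_1d), 0.1)
print(f"x = {x:.2f}, x' = {x_adj:.2f}")
```

Because a > 1 and b > 0, the adjusted error is always larger than the
catalogue value, so x' < x for every pair, as noted in the text. The
one-parameter variant corresponds to `adjusted_sigma_1d(s, a=1.0, b=0.37)`.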

Other tests involved (i) imposing a lower bound on the XMM
position error ($\sigma'_{1D} = \max(\sigma_{1D}, \sigma_{min})$) and
(ii) including an off-axis-dependent systematic involving a scalar, c,
($\sigma'^2_{1D} = \sigma_{1D}^2 + c^2\theta^2$), where $\theta$ is the
off-axis angle. These latter modifications provide slightly better
matches to the Rayleigh curve but still drive the peak of the data
to x ≈ 0.7, again creating an excess at x < 1. A further test in
which the XMM position error is defined as
$\sigma'_{1D} = \sigma_{1D}$ for $x < x_t$ and
$\sigma'^2_{1D} = \sigma_{1D}^2 + d^2(x - x_t)^2$ for $x > x_t$
(where d is a simple scalar and $x_t$ is a threshold value in x) does
yield a marked enhancement in the likelihood for the fit but, in
this case, the data undershoot the Rayleigh curve at x > 2 and
exceed it at 0.6 < x < 2.

We conclude that while the more complex adjustments to the
XMM position errors can formally improve the match between
the error-normalised XMM-quasar separations and the Rayleigh
curve, none provides a statistically acceptable match. Moreover,
the cases that yield the best improvements in the fit likelihood
have no compelling technical rationale.

Fig. 6. Similar to figure 5 but comparing results that involve the
simplest adjustments to the XMM position errors. For reference, the
black histogram is based on using the unadjusted XMM position errors
while the expected Rayleigh distribution is overlaid (grey). The blue
histogram represents the simplest adjustment to the XMM position
errors, involving the addition of a systematic in quadrature,
b (= 0.37), while the red histogram involves both a scaling of the
XMM position error by a factor a (= 1.12) and addition of a
systematic, b (= 0.27), in quadrature. These histograms are based on
slightly different filtering compared to figure 5, as explained in
the text.

9.3. Background flare filtering 

As noted in section 3.2.3, an optimisation algorithm was adopted
to determine the count rate threshold for defining the flare GTIs.
This process was employed to maximise sensitivity to source
detection and can come at the expense of reduced exposure time.
Often, the new process results in GTIs that are similar to those
derived from the fixed threshold cuts used in pre-cat9.0 pipeline
processing. However, in some cases, significant improvements
can be obtained in sensitivity.

Of particular interest are cases where the background rises
or falls slowly. In such cases, allowing a modest increase in the
background count rate can yield a marked increase in exposure
time, resulting in a significant improvement in the sensitivity to
the detection of faint sources. A good example of this is
illustrated in figure 7. As is evident from the light curves, the
optimised cut threshold includes significantly more exposure time
for a very modest increase in background level, producing a
factor 5.5 increase in the harvest of detected sources.
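
The trade-off between exposure time and background level can be
illustrated with a toy calculation. The sketch below is not the
pipeline algorithm (which is described in section 3.2.3): it assumes,
purely for illustration, a figure of merit T/sqrt(B) for a faint
source of constant rate, where T is the exposure retained below a
trial cut and B the background counts accumulated in that time. The
function name and the light curve values are hypothetical.

```python
import math

def best_rate_cut(rates, bin_s=1.0):
    """Return (cut, merit) maximising T / sqrt(B) over trial rate cuts.

    rates : background count rate in each time bin (cts/s)
    bin_s : duration of each time bin (s)
    """
    best = (None, 0.0)
    for cut in sorted(set(rates)):
        kept = [r for r in rates if r <= cut]
        exposure = bin_s * len(kept)        # T: time retained below the cut
        bkg_counts = bin_s * sum(kept)      # B: background counts in that time
        if bkg_counts <= 0:
            continue
        merit = exposure / math.sqrt(bkg_counts)
        if merit > best[1]:
            best = (cut, merit)
    return best

# Toy light curve: short quiet interval, long mildly elevated interval, brief flare.
rates = [1.0] * 10 + [1.2] * 40 + [20.0] * 5
cut, merit = best_rate_cut(rates)
print(cut, round(merit, 2))
```

In this toy case a fixed cut at the quiet level would keep only 10
bins, whereas accepting the 20% higher background keeps 50 bins and
roughly doubles the figure of merit; the flare bins are still rejected.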

Another aspect of the optimised flare filtering approach is 
that the increase in exposure time can result in exposures being 
used that were previously rejected in processing with pre-cat9.0 
pipelines. 

The pre-cat9.0 and cat9.0 light curves in figure 7 also highlight
the fact that the change of energy band used can yield some
significant differences in the strengths and even shapes of flare
features in the data.

The implementation of the optimised flare filtering approach
was done in conjunction with some of the other upgrades, such
as the use of the empirical PSF (see section 3.3). As such, we
have not directly isolated the impacts on source detection of the
optimised flare filtering process alone. Nevertheless, comparison
of the numbers of source detections between the set of 4922
observations that are common to the 2XMMi-DR3 and 3XMM-DR5
catalogues indicates a net increase of 10047 detections in
3XMM-DR5, i.e. a 2.9% increase.

Fig. 7. An example of the improvement offered by the optimised
background flare filtering algorithm. Top panels: Left: high-energy
MOS1 background flare light curve created by the pre-cat9.0 pipeline,
used for the 2XMMi-DR3 catalogue; the red line is the fixed
(2 cts/s/arcmin$^2$) count rate cut threshold applied. Right: in-band
(0.5-7.5 keV) light curve used in the cat9.0 pipeline used for
3XMM-DR4 and 3XMM-DR5; the red line shows the optimised cut rate
threshold derived for the light curve. The lower panels show the
resulting, corresponding (smoothed) images, after filtering out the
data above the respective rate-cut thresholds. Sources found by the
source detection algorithm are indicated by red circles.


9.4. Extraction of spectral and time series products 

Fig. 8. $\log_{10}(S_o/S_f)$ plotted against the log of the total
counts, $C_t$, measured from the optimised aperture. The grey points
indicate the data and include only clean (SUM_FLAG=0), point-like
(EP_EXTENT=0) detections. The red line links measurements of the
average $\log_{10}(S_o/S_f)$, in bins sampling the range in $C_t$, for
cases where $-1 < \log_{10}(S_o/S_f) < 1$. The blue line is similar
but is for the subset of data where, additionally, the background
rate is > $10^{-\ldots}$ cts s$^{-1}$ (sub-pixel)$^{-2}$ (sub-pixels
have side lengths of 0.05"). The lower X-axis limit reflects the
minimum threshold of 100 total counts in the optimised extraction
aperture, imposed for extracting XMM source spectra; the plot is
otherwise truncated for clarity.

As described in section 4.1, spectra and time series of detections
are now extracted using optimised extraction apertures that
are intended to maximise the overall S/N of the resulting product.
To assess this, spectra were re-extracted for all detections
and exposures for which spectra were produced during the bulk
reprocessing, using a circular aperture of fixed radius (28") in
each case, centred at the same location as the detection position
used during the bulk reprocessing. Other than the change
of aperture radius, processing was essentially identical to that
used in the bulk reprocessing. The S/N, S, of each spectrum was
then computed as $S = C_s/C_t^{1/2}$. Here $C_s = C_t - C_b$, where
$C_t$ is the total number of counts measured in the spectrum from the
source aperture, $C_s$ is the number of counts from the source in the
source aperture and $C_b$ is the number of background counts in the
source aperture, the latter being estimated from the total counts
in the background region, scaled by the ratio of source and
background region areas. Counts included in this analysis were drawn
from PHA channels with quality < 5 (in XSPEC terms). The S/N
was computed in this way for the spectra from the optimised and
fixed apertures; the spectral data used for background subtraction
were taken from the same background spectrum (from the
bulk reprocessing) in each case and the background counts used
were drawn only from the same channels as used for the source
counts.
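
The S/N definition used in this comparison can be sketched in Python
as follows. The function name and the count values are hypothetical;
the only ingredients taken from the text are $S = C_s/C_t^{1/2}$,
$C_s = C_t - C_b$, and the scaling of the background-region counts by
the ratio of source to background region areas.

```python
import math

def spectrum_snr(total_counts, bkg_region_counts, area_ratio):
    """S = C_s / sqrt(C_t), with C_s = C_t - C_b.

    total_counts      : C_t, counts in the source aperture
    bkg_region_counts : counts in the background region
    area_ratio        : source aperture area / background region area
    """
    c_b = bkg_region_counts * area_ratio   # background counts in the source aperture
    c_s = total_counts - c_b               # net source counts
    return c_s / math.sqrt(total_counts)

# Made-up example: a smaller optimised aperture versus a larger fixed one
# that collects the same net source counts but more background.
s_opt = spectrum_snr(total_counts=100, bkg_region_counts=200, area_ratio=0.10)
s_fix = spectrum_snr(total_counts=130, bkg_region_counts=200, area_ratio=0.25)
print(f"S_o = {s_opt:.2f}, S_f = {s_fix:.2f}, "
      f"log10(S_o/S_f) = {math.log10(s_opt / s_fix):.3f}")
```

In this invented case both apertures contain 80 net source counts,
so the larger fixed aperture only adds background and lowers the S/N.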

In Figure 8 the log of the ratio of the S/N values from the
spectra extracted from the optimised ($S_o$) and fixed ($S_f$)
apertures, i.e. $\log_{10}(S_o/S_f)$, is plotted against log($C_t$)
from the optimised aperture, for MOS1 spectra. Only spectra from the
cleanest (SUM_FLAG=0), point-like (EP_EXTENT=0) detections are
included.

It is evident from the positive asymmetry about
$\log_{10}(S_o/S_f) = 0$ that the optimisation procedure does
improve the S/N of the spectra, especially for spectra with lower
($C_t$ < 500) numbers of extracted counts, as expected. Overall,
67.5% of the MOS1 spectra with 100 < $C_t$ < 50000 cts (within
$-1 < \log_{10}(S_o/S_f) < 1$, which excludes 21 positive outliers)
have higher S/N in the optimised aperture than those extracted
from the fixed apertures. The red line in figure 8 shows the
average of $\log_{10}(S_o/S_f)$ of all the data as a function of
$C_t$ and indicates that spectra extracted from the optimised
apertures with $C_t$ = 100 cts have, on average, S/N values 12%
higher than those extracted in the fixed apertures. It is
anticipated that sources detected in fields with high background
levels would benefit from the optimisation procedure. Indeed the
blue line in figure 8, which reflects the subset of detections whose
background levels are above $10^{-\ldots}$ cts s$^{-1}$
(sub-pixel)$^{-2}$ (i.e. amongst the highest 15% of background
levels), demonstrates this: spectra of such detections extracted
from optimised apertures with $C_t$ = 100 cts have average S/N values
39% higher than the spectra from the corresponding fixed apertures.

[Fig. 10 x-axis: (ln(Ref EXT_3XMM) - ln(EXT_3XMM))/sigma]

Fig. 10. Histogram of the logarithm of the ratio of extensions between
the best observation and the other observations of the same source,
normalised by the error. The solid red line is the best Gaussian fit
to the histogram. The dashed red line is the expected mean (0).

Fig. 11. Comparison of the ratio of extensions and the ratio of count
rates obtained by the 3XMM and the XCLASS catalogues. The red solid
line is the relation 1:1.


9.5. Extended sources 

The detection and characterisation of extended sources for
3XMM was performed as in 2XMM (Watson et al. 2009). The
caveats listed in section 9.9 of that paper still apply to 3XMM.
However, the better representation of the PSF has helped to
improve extended source detection and characterisation. Many
extended sources with SC_SUM_FLAG = 4 in 2XMM now have
SC_SUM_FLAG = 3 in 3XMM, indicating that the region is
still complex but the detection itself is unlikely to be spurious.
We have also looked at the distribution of extension likelihood
vs. flux as in Fig. 15 of Watson et al. (2009). Fig. 9 shows that
3XMM considers many bright extended sources to be reliable
(SC_SUM_FLAG < 2) whereas in 2XMM most of them had
higher flag values indicating more significant issues with the data
quality.

We have complemented this study by inter-comparing the
3XMM (DR4) results when a source was observed more than
once, and by comparing with an independent serendipitous search
for clusters of galaxies. We restricted the comparison to the
best-quality sources with SC_SUM_FLAG = 0. In 3XMM-DR4, 667
sources have been observed several times as extended, each
observation being processed independently. We define as the
“reference value” the extension (EP_EXTENT) associated with the
detection with the highest likelihood value (EP_8_DET_ML column). We
investigated the agreement of the extension parameter between
the “reference” and the other observations of the same source.
We ignored observations when a given source was not detected
as extended (mostly because of insufficient exposure) or when
the extension was set to 80" (maximum value allowed in the fit).

In Fig. 10 we show the distribution (in log space) of the
ratio between the “reference” extension $Ext_{ref}$ and the
current one $Ext_{cur}$, normalised by the corresponding error, equal
to $\sqrt{(\sigma_{ref}/Ext_{ref})^2 + (\sigma_{cur}/Ext_{cur})^2}$,
where $\sigma_{ref}$ and $\sigma_{cur}$ are the extension errors for
the “reference” and current observation respectively. We fit the
resulting histogram with a Gaussian function, obtaining a mean value
equal to 0.512 (in $\sigma$ units) with a standard deviation equal to
1.943 (we would expect a mean of 0 and a standard deviation of 1 for
random fluctuations). We conclude that there exists an additional
scatter larger than statistical (of unknown origin) and that the
reference observation, which is also the deepest one, estimates a
larger extension on average.
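
The normalised quantity histogrammed in this comparison can be
written as a short Python function. This is a sketch under the
assumption that the error on ln(Ext) is approximated by the fractional
error $\sigma$/Ext, as implied by the expression in the text; the
function name and the example values are hypothetical.

```python
import math

def normalised_log_ratio(ext_ref, err_ref, ext_cur, err_cur):
    """(ln Ext_ref - ln Ext_cur) divided by the combined fractional error."""
    sigma = math.hypot(err_ref / ext_ref, err_cur / ext_cur)
    return (math.log(ext_ref) - math.log(ext_cur)) / sigma

# Made-up example: reference extent 20" +/- 2", current extent 10" +/- 1".
value = normalised_log_ratio(20.0, 2.0, 10.0, 1.0)
print(f"{value:.2f}")
```

For purely statistical scatter, values computed this way over many
source pairs would be expected to follow a unit Gaussian centred on 0.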

The XCLASS catalogue is based on the analysis of archival
observations from the XMM-Newton observatory. The XCLASS
team processed 2774 high Galactic latitude observations from
the XMM archive (as of 2010 May) and extracted a serendipitous
catalogue of some 850 clusters of galaxies based on
purely X-ray criteria, following the methodology developed
for the XMM Large Scale Survey (Pierre et al. 2007). We
used the subsample of 422 galaxy clusters available online at
http://xmm-lss.in2p3.fr:8080/14sdb/ to compare the extension
and the count rate obtained for the same sources from the two
different procedures (i.e. the XCLASS and 3XMM processing).
The analytic expression used to represent extended sources in
XCLASS was the same as in 3XMM ($\beta$-model with $\beta = 2/3$) so
the numbers should be directly comparable. All 422 clusters are
in 3XMM-DR4, but 59 (mostly faint or irregular objects) were
classified as point sources.

For the 363 extended sources in common, we compared the
extent and the count rate in the [0.5-2.0] keV band obtained by
3XMM and XCLASS. We found that, for both quantities, the
3XMM estimates seem to be biased low with respect to the
XCLASS values. The best fit regression on source extent
resulted in a slope of 0.7 ($Ext_{3XMM} = 0.7\,Ext_{XCLASS}$).
Excluding clear outliers (difference of extension larger than 20",
typically very faint sources or very bright sources affected by a
strong pile-up) the slope increases to 0.85. We conclude that, even
excluding these extreme sources, there remains a bias of ≈ 15%
between the extensions estimated by 3XMM and XCLASS.

Fig. 9. Distribution of total flux and extension likelihood of all
extended sources with SC_SUM_FLAG < 2, in the 2XMM (left) and 3XMM
(right) catalogues.

There exists a similar (a little smaller) bias on the count rate.
However, Fig. 11 shows that there exists a close correlation
between both ratios, implying that only one parameter describes
the difference in extent and count rates and that, if the source
extents were forced to agree, the count rates would agree too.
There is no obvious way to know whether the 3XMM or the
XCLASS estimate is better but, together with the inter-3XMM
comparison, this result indicates that the purely statistical
extension error underestimates the real error.

10. Examples 

Thanks to the wide range of parameters provided in the catalogue,
sources matching specific criteria can be isolated (for example
variability criteria or X-ray hardness ratios). In this section
we show some examples of lightcurves (Fig. 12) and spectra
(Fig. 13) extracted from the different EPIC cameras. The
plots shown are those associated with the on-line catalogue. Both
known and new sources are presented. It is immediately obvious
from the two figures that objects with extremely diverse
characteristics are found. Variability on very different timescales
is seen in Fig. 12, showing short and long flares, slow rises and
steady declines in count rate as well as deep eclipses. From visual
examination of the strong variability in Fig. 12c, it was quickly
obvious that this new X-ray source was a polar (Webb et al. to be
submitted). Fig. 12e shows a strong decline in flux, which, when
coupled with the hard spectrum observed for this source, suggests
that this might be a previously unknown orphan gamma-ray
afterglow.

The spectra shown in Fig. 13 are also very varied and originate
from a variety of astrophysical objects, ranging from stars and
compact objects to galaxies and clusters of galaxies. An
unidentified X-ray source is included in Fig. 13k, which also has a
highly variable lightcurve, showing a steady decline in count rate,
but with a strong flare superposed. The nature of this source is not
obvious and more work will be needed to identify it. The
sources in the full 3XMM catalogue are of course dominated by
unidentified objects, emphasising the large discovery space
provided by the catalogue.

11. Catalogue access 

The catalogue is provided in several formats. Firstly, a Flexible
Image Transport System (FITS) file and a comma-separated
values (CSV) file are provided containing all of the detections
in the catalogue. For 3XMM-DR5 there are 565962 rows and
323 columns. A separate version of the catalogue (the slim
catalogue) is also provided that contains only the unique sources,
i.e. 396910 rows, and has 44 columns, essentially those
containing information about the unique sources. This catalogue is
also provided in FITS and CSV format. Ancillary tables to the
catalogue, also available from the XMM-Newton Survey Science
Centre webpage, include the table of observations incorporated
in the catalogue and the target identification and classification
table.
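
As an illustration of how the CSV version might be filtered, the
following Python sketch selects clean, point-like detections using
the SUM_FLAG and EP_EXTENT columns referred to in this paper. The
in-memory sample, its values and the DETID identifier column are
assumptions made for the example; a real catalogue file has 323
columns and would be opened from disk instead.

```python
import csv
import io

# Tiny stand-in for the detection catalogue CSV (made-up rows).
SAMPLE_CSV = """DETID,SUM_FLAG,EP_EXTENT
1,0,0.0
2,3,0.0
3,0,12.5
4,0,0.0
"""

def clean_pointlike(rows):
    """Yield rows with SUM_FLAG = 0 (clean) and EP_EXTENT = 0 (point-like)."""
    for row in rows:
        if int(row["SUM_FLAG"]) == 0 and float(row["EP_EXTENT"]) == 0.0:
            yield row

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
selected = [row["DETID"] for row in clean_pointlike(reader)]
print(selected)
```

Streaming the rows through a generator keeps memory use low, which
may matter for a table with several hundred thousand detections.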

The XMM-Newton Survey Science Centre webpages provide 
access to the 3XMM catalogue, as well as links to the different 
servers distributing the full range of catalogue products. These 
include, the XMM-Newton XSA, which provides access to all of 
the 3XMM data products, and the ODF data, the XCat-DHQ pro¬ 
duced and maintained by the XMM-Newton SSC, which contains 
possible EPIC source identiheation produced by the pipeline by 
querying 228 archival catalogues. Einding charts are also pro¬ 
vided for these possible identiheations. Other source properties 
as well as images, time series, spectra, ht results from the XMM- 
EITCAT are also provided. Multi-wavelength data taken as a 
part of the XID (X-ray identiheation project) run by the SSC 
over the hrst hfteen years of the mission are also provided in 
the XIDresult databas^E The LEDAS serveJ3 provides another 

^ http://xmmssc.irap.omp.eu/ 

’ http://xcatdb.unistra.fr/3xmm/ 

* http://xcatdb.unistra.fr/xidresult/ 

^ http://www.ledas.ac.uk/arnie5/arnie5.php?action=basic&catname=3xmm 
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Fig. 12. Example lightcurves taken directly from the 3XMM catalogue, a) 3XMM Jill 146.1-762010 = CHX 18N - T Tau-type star showing a 
short flare b) 3XMM J000619.5+201210, A Seyfert 1, Markarian 335. c) 3XMM J184916.1-t-652943, a new 1.6 hour polar (Webb et al. to be 
submitted), d) A 2MASS star (2MASS J00025638-3004447) showing two large flares, e) 3XMM J002159.4-r614254, a new X-ray object showing 
a strong decline in flux. Possibly a gamma-ray burst afterglow, f) 3XMM J013334.0-H3 03211, a high mas s X-ray binary in M 33, M33 X-7, 
showing a 12.5 hour eclipse - the first eclipsing stellar-mass black hole binary discovered (Pietsch et al. 2006).


way to access the 3XMM catalogue and its products, whilst the
upper limit server allows the user to specify a sky position
and obtain upper limits on the EPIC fluxes of a point source at
the position if the location has been observed by XMM-Newton
but no source was detected. The catalogue can also be accessed
through HEASARC and VizieR. The results of the external
catalogue cross-correlation carried out for the 3XMM catalogue
(section 7) are available as data products within the XSA
and LEDAS or through the XCat-DB. The XMM-Newton Survey
Science Centre webpages also detail how to provide feedback on
the catalogue.

Where the 3XMM catalogue is used for research and
publications, please acknowledge its use by citing this paper
and including the following:


http://www.ledas.ac.uk/flix/flix.html 
http://heasarc.gsfc.nasa.gov/db-perl/W3Browse/w3table.pl?tablehead=name%3Dxmmssc&Action=More+Options
http://vizier.u-strasbg.fr/cgi-bin/VizieR 


This research has made use of data obtained from the 3XMM
XMM-Newton serendipitous source catalogue compiled by the
10 institutes of the XMM-Newton Survey Science Centre
selected by ESA.

12. Future catalogue updates 

Incremental releases (data releases) are planned to augment the
3XMM catalogue. At least one additional year of data will be
included with each data release. Data release 6 (DR6) will
provide data becoming public during 2014 and 2015 and should be
released during 2016. These catalogues will be accessible as
described in Section 11.

13. Summary 

This paper presents the third major release of the XMM-Newton
serendipitous source catalogue (3XMM), in its original version
(3XMM-DR4) and in the first incremental version (3XMM-DR5).
The 3XMM catalogues have been constructed by the

Fig. 13. Example spectra taken directly from the 3XMM catalogue, showing the diversity of the sources in the 3XMM catalogue. Energy (keV)
is given on the abscissa and count s$^{-1}$ keV$^{-1}$ on the ordinate. a) 3XMM J052532.5+062533, an X-ray source of unknown nature, as are the majority
of the sources in 3XMM. b) 3XMM J123536.6-395433, a Seyfert 2 galaxy (NGC 4507). c) 3XMM J125141.9+273226, a rotationally variable
star, 31 Com. d) 3XMM J162838.2+393303, a cluster of galaxies. e) 3XMM J011127.5-380500, the pn spectrum of NGC 424, a Seyfert 2 galaxy. f)
3XMM J185246.6+003317, a new transient magnetar discovered by Zhou et al. (2014). Some low data points can be seen in the plots originating
from pn data, but these are corrected for when the spectra are plotted in conjunction with the distributed response files.


XMM-Newton Survey Science Centre and the 3XMM-DR5 catalogue
is the largest catalogue of X-ray sources detected
using a single X-ray observatory. The characteristics and
improvements of this catalogue, with respect to previous
versions, are outlined, as well as how to cite and access the
catalogue. This paper serves as the reference for future
incremental versions of the same catalogue (3XMM-DR6, etc.),
as new XMM-Newton data become publicly available.

Appendix A: Known issues affecting 3XMM-DR4
only

- After the creation of the 3XMM-DR4 catalogue, it was
discovered that the raw event files from the ODFs of a number
of mosaic mode sub-pointing observations contained
corrupted data whereby some of the events in a given
sub-pointing ODF were actually from another sub-pointing.
Since the raw event positions are specified in detector
coordinates and are subsequently mapped to their sky locations
during pipeline processing by reference to the observation
boresight position, which is specified for the given
sub-pointing, the celestial positions of these events are
wrong, which results in some detections having incorrect
celestial coordinates. The problem arose in the algorithm used
to split the raw parent ODF into sub-pointing ODFs. In some
cases all instruments were affected while in others only one or
both of the MOS instruments were affected. Of the 419
mosaic-mode sub-pointing observations included in 3XMM-DR4,
82 are affected to some extent, involving 4918 detections.
The affected observations are listed in the watchout section
of the XMMSSC 3XMM-DR4 catalogue web page. For
3XMM-DR5, none of the affected mosaic sub-pointing
observations is included in the catalogue.

http://heasarc.gsfc.nasa.gov/ftools/

- The vignetting values provided in the 3XMM-DR4 catalogue
(for each instrument, for bands 1 to 5) were found to have
been computed for an energy of 0 keV rather than the
energy relevant to the band. Thus the values for each band of
a given instrument are identical. This error does not affect
the count rates or fluxes as the vignetting correction applied
to them is computed separately and has been verified as
correct. It is only the tabulated values in the vignetting
columns of the catalogue that are incorrect in 3XMM-DR4 and
they are correct in 3XMM-DR5.

- A significant issue identified after the public release of the
3XMM-DR4 catalogue relates to the error values on various
quantities. It was established that the error quantities
(i.e. columns containing an _ERR at the end) for the XID
band (band 9) count rates and fluxes of a significant number
(~42200) of detections (~10% of the catalogue) were
substantially wrong (generally being overestimated by factors
up to ~100 but, in a few cases, up to 1000). A more
detailed investigation found that while all error columns are
potentially affected (and therefore also any derived
parameters involving error-weighted quantities, such as some of
the unique source quantities), the frequency and magnitude
of the problem are much worse for the XID band data than
for any other parameter. It has been established that for other
key quantities, such as the statistical positional uncertainty
(RADEC_ERR) and the instrument count rates and fluxes in
other (non-XID) bands, only about 1.3% of detections are
affected and, generally, the scale of the problem is very small.
For the positional uncertainty, 1.4% of detections have
incorrect RADEC_ERR values and only 0.26% of detections have
position errors that differ from their correct values by more
than 0.05" while for only 89 detections does it differ by more
than 0.5" (of which 58 are detected as extended sources and
81 have a non-zero quality flag). Furthermore, for 81% of
those detections where the position error is wrong by more
than ±0.05", the correct position error is smaller than that
quoted in the 3XMM-DR4 catalogue. The most extreme
deviations of the RADEC_ERR values from their correct values
are 32" larger and 2.3" smaller. For the PN band 2 flux
errors, only ~1.1% of detections have values that deviate from
their correct values by more than 10“^, when expressed as a
fraction of the correct value. For the errors on the XID band
photometric quantities (rates, fluxes, counts) the correct error
is generally smaller than that given in 3XMM-DR4.

Thus, while there is a significant problem with the error 
quantities on the XID band photometric data in 3XMM-DR4, 


http://xmmssc-www.star.le.ac.uk/Catalogue/xcat_public_3XMM-DR4.html


Table B.1. Data modes of XMM-Newton exposures included in the
3XMM catalogue.

Abbr.   Designation                  Description
------  ---------------------------  ----------------------------------
MOS cameras:
PFW     Prime Full Window            covering full FOV
PPW2    Prime Partial W2             small central window
PPW3    Prime Partial W3             large central window
PPW4    Prime Partial W4             small central window
PPW5    Prime Partial W5             large central window
FU      Fast Uncompressed            central CCD in timing mode
RFS     Prime Partial RFS            central CCD with different frame
                                     time ('Refreshed Frame Store')
pn camera:
PFWE    Prime Full Window Extended   covering full FOV
PFW     Prime Full Window            covering full FOV
PLW     Prime Large Window           half the height of PFW/PFWE


the problem is much less severe for other quantities. It is 
emphasised that the correct error quantities are present in 
3XMM-DR5. 


Appendix B: Data modes of XMM-Newton 
exposures included in the 3XMM catalogue. 

The data modes are described in Table B.1.


Appendix C: Definitions relating to 3XMM-DR5 
detections referred to in this work 

We describe here some of the important quantities relating to
3XMM-DR5 detections that are frequently referred to in the
paper.

RADEC_ERR is the statistical position error, defined as
$(\mathrm{ra\_err}^2 + \mathrm{dec\_err}^2)^{1/2}$, where ra_err
and dec_err are the 1-sigma errors in the RA and DEC coordinate
directions, respectively, determined during the fitting of the
PSF to the source image.

SYSERRCC is the estimated 1-sigma error from the rectification
process, as defined by equation 3 in section 3.4.2.

POSERR is the error representing the quadrature combination
of RADEC_ERR and SYSERRCC, i.e.
$(\mathrm{RADEC\_ERR}^2 + \mathrm{SYSERRCC}^2)^{1/2}$.
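
As a minimal illustration of the two quadrature sums defined
above, the following Python sketch computes RADEC_ERR and POSERR.
The function names and the numerical values are hypothetical and
are chosen only for illustration; they are not catalogue entries.

```python
import math


def radec_err(ra_err: float, dec_err: float) -> float:
    """Statistical position error: quadrature sum of the 1-sigma
    RA and DEC errors (all values in arcsec)."""
    return math.hypot(ra_err, dec_err)


def poserr(radec_err_value: float, syserrcc: float) -> float:
    """Total position error: quadrature sum of RADEC_ERR and
    SYSERRCC (all values in arcsec)."""
    return math.hypot(radec_err_value, syserrcc)


# Hypothetical example values in arcsec (not real catalogue rows).
stat = radec_err(0.6, 0.8)
total = poserr(stat, 0.5)
print(f"RADEC_ERR = {stat:.3f} arcsec, POSERR = {total:.3f} arcsec")
```

Because the two terms add in quadrature, POSERR is dominated by
whichever of the statistical and systematic errors is larger.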

Count rates for detections are given in the <ca>_<b>_RATE
columns in the catalogue for EPIC camera, <ca>, in energy band
<b>, for bands 1-5 and 9. These are the total integrated counts
for the detection, derived from within the whole PSF fitted to the
source image, divided by the exposure map value at the source
position. The count rate values are corrected to the rate at the
on-axis position. The band-8 rates in each camera are the sum
over bands 1-5.

Fluxes are provided in the <ca>_<b>_FLUX columns.
These are converted from the count rates via energy conversion
factors (ECFs) (see table 2), assuming a power-law spectrum
with $N_{\rm H} = 3 \times 10^{20}$ cm$^{-2}$ and a power-law
photon index of 1.7.
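
The rate-to-flux conversion can be sketched as below. This is an
illustrative sketch only: it assumes the convention flux = rate/ECF
with the ECF expressed in units of 10^11 count cm^2 erg^-1, and the
ECF value used is a hypothetical placeholder, not a value from
table 2. The function name is likewise hypothetical.

```python
def rate_to_flux(rate: float, ecf: float) -> float:
    """Convert a count rate (count/s) into a flux (erg/cm^2/s).

    Assumption: flux = rate / ECF, with the ECF given in units of
    1e11 count cm^2 / erg.
    """
    if ecf <= 0:
        raise ValueError("ECF must be positive")
    return rate / (ecf * 1e11)


# Hypothetical placeholder; real ECFs depend on camera, filter and band.
HYPOTHETICAL_ECF = 5.0
flux = rate_to_flux(0.01, HYPOTHETICAL_ECF)
print(f"flux = {flux:.2e} erg/cm^2/s")
```

Since the ECFs are computed for one fixed spectral model, fluxes of
sources with very different spectra are only approximate.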

The summary flags, in the SUM_FLAG column, provide a
simple overview of the quality of the detection, based on a
combination of the automatic flags and flags set during manual
(visual) screening. Values are: 0 - identifies the best quality
detections, i.e. those with no evident complicating factors;
1 - detections where the source parameters may be affected;
2 - cases where the automatic analysis suggests the detection
may be a spurious extended source or associated with known
detector features but is not flagged by manual screening;
3 - cases where manual screening has flagged the detection but
automatic flags are not set; 4 - detections where both automatic
and manual screening flags are set.
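
A short sketch of how the SUM_FLAG values might be used to select
detections is given below. The row layout, the identifiers and the
helper name are hypothetical; only the meaning of the five flag
values is taken from the description above.

```python
SUM_FLAG_MEANING = {
    0: "best quality, no evident complicating factors",
    1: "source parameters may be affected",
    2: "possibly spurious or on a detector feature (automatic flags only)",
    3: "flagged by manual screening only",
    4: "flagged by both automatic and manual screening",
}


def select_by_sum_flag(detections, max_flag=0):
    """Return the detections whose SUM_FLAG does not exceed max_flag."""
    return [d for d in detections if d["SUM_FLAG"] <= max_flag]


# Hypothetical rows for illustration; the identifiers are made up.
rows = [
    {"ID": 1, "SUM_FLAG": 0},
    {"ID": 2, "SUM_FLAG": 3},
    {"ID": 3, "SUM_FLAG": 1},
]
clean = select_by_sum_flag(rows, max_flag=0)
print([r["ID"] for r in clean])
```

Raising max_flag to 1 keeps detections whose parameters may be
affected while still excluding those flagged as possibly spurious.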


Appendix D: Detailed description of the issue
known as 'missing sources'

Of the ~25700 missing 3XMM detections, up to 8% are found
only in the pn band-1 data. Visual inspection of examples and
analysis of the pn detector-image data suggest that many of
these are probably previously unrecognised MIP features, i.e.
spurious detections, in 2XMMi-DR3 (see section 3.2.2), though
some may well be real, soft sources. A second,
difficult-to-quantify percentage (but <7%) of the missing 3XMM
detections may have detected counterparts in the 3XMM
catalogues but be unmatched within 10" due to imperfect
astrometry in the 2XMMi-DR3 and/or 3XMM catalogue. A third
component of up to around 3% of the missing 3XMM detections may
be detections in 2XMMi-DR3 that are associated with hitherto
unrecognised/unflagged detector features - such features become
apparent when the missing 3XMM detections are plotted in detector
coordinates for each EPIC instrument, after allowing for likely
real detections in the same regions that are detected in more than
1 instrument.

Other explanations for the missing 3XMM detections
include:


- A small number (<1%) are pairs of visually verified close
sources that were separated in 2XMMi-DR3 but found as
either a single extended or a single unresolved point source
in 3XMM.

- A small number of cases are likely spurious detections in
the wings of bright sources in 2XMMi-DR3 that were not
flagged during the manual screening process for 2XMMi-DR3
and were not detected in 3XMM.


The above-mentioned explanations account for <20% of
all the clean, point-like missing 3XMM detections. Some 75%
of the missing 3XMM detections have EPIC likelihoods in
2XMMi-DR3 of L < 10 (90% have L < 15). It might be thought
that the missing 3XMM detections could arise from spurious
detections due to random statistical background fluctuations
(false positives) in 2XMMi-DR3 - the numbers of such detections,
estimated from simulations, were discussed in section 9.4 of
Watson et al. (2009). Using the cumulative count rates presented
in Fig. 10 of Watson et al. (2009) and the distribution of
exposure times for observations in 2XMMi-DR3, we estimate around
7500 detections in the common observations might be false
positives. This, however, is probably an overestimate of the
contribution of false positives to the missing sources because,
although there are notable changes to the pipeline processing
between the 2XMMi-DR3 and 3XMM catalogues, the input ODFs and
associated event data are often the same for the common
observations, i.e. the data are not independent. It should be
noted that of the ~25700 missing 3XMM-DR5 detections, ~5200
belong to unique sources that comprise at least one other
2XMMi-DR3 detection, hinting that at least 20% are probably real.

The distributions of the band-8 likelihood for clean,
point-like EPIC detections found in 3XMM-DR5 and not in
2XMMi-DR3 (and vice versa) are very similar and strongly biased
to low (6 < L < 10) likelihood values. Both are much more strongly
concentrated in this range than the distribution of all clean,
point-like detections. Evidently, the issue of the 'missing'
detections is primarily related to changes affecting detections
with likelihoods near the L = 6 threshold used for the catalogue.

It is clear that two of the major improvements to the 3XMM
catalogue with respect to the 2XMM catalogue, namely the
new empirical PSF, described in Sec. 3.3, and the optimised flare
filtering (see Sec. 3.2.3), could have an impact on the detection
likelihoods and hence the numbers of detected sources. We
investigated the impact of these two improvements. Optimising the
flare filtering maximises the signal-to-noise ratio of the sources
but also affects the background level, as described in Sec. 3.2.3.
To explore the impact of changing the background, we scaled
the 3XMM-DR5 background maps around their original values
- raised background model values from 3XMM-DR5 images at
the positions of faint sources could reduce the detection
likelihood below the threshold of 6. Our analysis was limited to
scaling the entire original background map for each available
instrument by a common factor (in steps of 2% between 90%
and 110%). While this is not adequately representative of real
background variations between processings, which would vary
across the field (see below), it helps to illustrate the potential
effects that may occur.
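
The sensitivity of a near-threshold detection to the background
level can be illustrated with a toy calculation. The sketch below
is not the emldetect likelihood, which comes from PSF fitting; it
substitutes a simple aperture-style Poisson likelihood,
L = -ln P(N >= n | b), and the counts and background values are
hypothetical. Only the scaling range (90% to 110% in steps of 2%)
and the threshold of 6 are taken from the text.

```python
import math


def poisson_sf(n_obs: int, mean: float) -> float:
    """P(N >= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)  # P(N = 0)
    cdf = 0.0
    for k in range(n_obs):
        cdf += term
        term *= mean / (k + 1)
    return max(1.0 - cdf, 1e-300)


def toy_likelihood(n_obs: int, background: float) -> float:
    """Toy detection likelihood L = -ln P(N >= n_obs | background)."""
    return -math.log(poisson_sf(n_obs, background))


THRESHOLD = 6.0
N_OBS = 13          # hypothetical total counts at the source position
BACKGROUND = 5.0    # hypothetical background counts at the original level

# Scale the background from 90% to 110% in steps of 2%.
for percent in range(90, 111, 2):
    scale = percent / 100.0
    like = toy_likelihood(N_OBS, BACKGROUND * scale)
    status = "detected" if like >= THRESHOLD else "below threshold"
    print(f"scale {scale:.2f}: L = {like:5.2f} ({status})")
```

In this toy case a faint source that lies just above the threshold
at the original background level drops below it when the background
is raised by a few per cent, which is the qualitative behaviour
being probed.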

From a subset of 1854 fields, we find up to ~9700 extra
detections may appear if the background is systematically
underestimated by 10% and ~6800 fewer detections appear if the
background is 10% higher than the original level. However, as
noted, the differences in the background maps between the
2XMMi-DR3 and 3XMM processings are much more likely to occur on
a spatially localised scale rather than as a uniform change
across the field of view. To look for indicators that this might
be the case, we computed ratios of the background maps
(3XMM-DR5/2XMMi-DR3) in each instrument and band and looked for
deviations of the ratio (relative to the mean of the ratio image)
at the positions of 2XMMi-DR3 detections that are missing in
3XMM-DR5. We observe a spread of up to ±20% in the
deviations of the ratio at some source positions but no evidence
of systematic background over-estimation in a specific instrument
or band. Nevertheless, background enhancements that push the
EPIC band 8 detection likelihood below 6 could be arising in a
different instrument and/or energy band in each case.
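
The ratio test described above can be sketched as follows. The maps
here are tiny hypothetical arrays held as nested lists, and the
function name is an assumption made for illustration; real
background maps are full images per instrument and band.

```python
def ratio_deviation(new_map, old_map, positions):
    """Fractional deviation of the new/old background ratio from the
    mean of the ratio image, evaluated at the given pixel positions.

    new_map, old_map: 2D lists of background values, same shape.
    positions: list of (row, col) pixels of the missing detections.
    """
    ratio = [
        [n / o for n, o in zip(new_row, old_row)]
        for new_row, old_row in zip(new_map, old_map)
    ]
    values = [v for row in ratio for v in row]
    mean_ratio = sum(values) / len(values)
    return [ratio[r][c] / mean_ratio - 1.0 for r, c in positions]


# Hypothetical 2x2 maps: the newer map is 20% higher in one pixel.
new_bkg = [[1.0, 1.0], [1.0, 1.2]]
old_bkg = [[1.0, 1.0], [1.0, 1.0]]
devs = ratio_deviation(new_bkg, old_bkg, [(1, 1), (0, 0)])
print([f"{d:+.1%}" for d in devs])
```

A positive deviation at a source position means the newer
background is locally enhanced relative to the field-wide change,
which would lower the detection likelihood there.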

The second effect of the improved flare filtering is the
impact on the good time intervals. We investigated the relation
between exposure time, the number of counts in the source (count
number) and detection likelihood, using only the pn parameters
of detections that are in both 2XMMi-DR3 and 3XMM-DR5
(i.e. whose EPIC detection likelihood is > 6 in each
catalogue). We expect that we would see a similar relation for the
combined instrument (EPIC) source parameters if we could
include detections with EPIC likelihoods below six in the
catalogues (which, by definition, we do not have). Fig. D.1 shows
the ratio of pn source count numbers (DR5/DR3) plotted against the
corresponding ratio of median pn exposure times. The data
include only point sources with SUM_FLAG < 1 that are isolated
and not affected by nearby extended sources, in both catalogues.
The red points reflect sources whose pn detection likelihood in
3XMM-DR5, $L_{\rm pn(DR5)}$, is > 6 while their pn detection
likelihood in 2XMMi-DR3, $L_{\rm pn(DR3)}$, is < 6 - these are
detections that, based on their pn data, would be present in
3XMM-DR5 and not in 2XMMi-DR3. The blue points represent data
where $L_{\rm pn(DR5)}$ < 6 and $L_{\rm pn(DR3)}$ > 6, i.e. which
would be in 2XMMi-DR3 and not in 3XMM-DR5.

To investigate the impact of changing the PSF to the
empirical model, we reprocessed the 4921 fields that are common to
3XMM-DR5 and 2XMMi-DR3 using the same source-detection
steps of the pipeline, input data and calibration files that were

Fig. D.1. Ratio of the pn source count numbers (DR5/DR3) plotted
against the corresponding ratio of median pn exposure times. Red filled
points have a pn detection likelihood in 3XMM-DR5 of $L_{\rm pn(DR5)}$ > 6
and a pn detection likelihood in 2XMMi-DR3 of $L_{\rm pn(DR3)}$ < 6. The
blue open points have $L_{\rm pn(DR5)}$ < 6 and $L_{\rm pn(DR3)}$ > 6.

Fig. D.2. The maximum likelihood (ML) values of all the sources
common to 3XMM-DR5 and 2XMMi-DR3. On the ordinate are the ML
values in 3XMM-DR5 and on the abscissa, the ML values in
2XMMi-DR3. The (green) solid diagonal line indicates where the
3XMM-DR5 and 2XMMi-DR3 ML values are equal.


employed to create 3XMM-DR5, but changing the new
empirical PSF model to the previous 'medium-accuracy' model. We
found some 8300 clean, point-like 3XMM-DR5 detections have
no matching detection obtained with the medium-accuracy PSF
within 5" (7100 within 10"), demonstrating that the new PSF
has a non-negligible effect on the source detection. It should also
be pointed out that emldetect gives the most reliable results when
the PSF model used is similar to the true PSF (Feigelson & Babu
2006), implying that the empirical PSF, constructed from
observed source data, should provide more reliable detections.
Changing both the background and PSF therefore has an impact on
the maximum likelihood determined. Fig. D.2 shows the
relationship between the maximum likelihood (ML) in 3XMM-DR5
and 2XMMi-DR3 for all sources common to both catalogues.
More than half the sources have a higher maximum likelihood
in 3XMM-DR5 compared to 2XMMi-DR3, indicating generally
better sensitivity in 3XMM-DR5. It should also be noted that,
as indicated in the emldetect description, the maximum
likelihood values provide only a rough estimate of the number of
expected spurious sources for low count sources (< 9, Cash 1979).
As many as 10% of the catalogue sources have counts < 9 in at
least one instrument, so sources with few counts may have
an inaccurate likelihood value attributed.

The changes to the pipeline generating the catalogue sources
mean that the maximum likelihood value attributed to each
source varies from catalogue to catalogue. The dispersion is high
(~2 in ML) for the distribution of ML values in one catalogue,
given a specific ML value in the other catalogue. Given that we
have chosen a threshold of ML > 6 to indicate a real source,
many sources with an ML close to this value in one catalogue will
have an ML < 6 in the other, due to this broad dispersion.
Indeed, as many as 10000 sources can be found below this
threshold in the other catalogue (2XMMi-DR3 when comparing with
3XMM-DR5 or 3XMM-DR5 when comparing with 2XMMi-DR3),
and are therefore considered as missing when considering

http://xmm.esac.esa.int/sas/current/doc/emldetect/node3.html 


one catalogue over the other. In reality, it is simply that the ML
value has fallen slightly below our chosen threshold and
therefore the source is just not included in the catalogue.
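
The effect of this dispersion near the threshold can be illustrated
with a simple model. The sketch below assumes, purely for
illustration, that the ML value in the second catalogue scatters
about the ML value in the first with a Gaussian distribution of
standard deviation 2; the Gaussian shape and the reading of
"~2 in ML" as a standard deviation are both assumptions, and the
function name is hypothetical.

```python
import math


def prob_below_threshold(ml: float, sigma: float = 2.0,
                         threshold: float = 6.0) -> float:
    """Probability that a source with likelihood `ml` in one catalogue
    has a likelihood below `threshold` in the other, assuming Gaussian
    scatter of standard deviation `sigma` about `ml`."""
    z = (threshold - ml) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


for ml in (6.0, 7.0, 8.0, 10.0, 12.0):
    p = prob_below_threshold(ml)
    print(f"ML = {ml:4.1f}: P(ML < 6 in the other catalogue) = {p:.3f}")
```

Under these assumptions the chance of dropping out falls quickly
with increasing ML, so the sources that go missing are concentrated
just above the threshold, in line with the likelihood distributions
described earlier in this appendix.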

In conclusion, the main reason for the missing sources is
that the changes in the pipeline processing procedure between
2XMMi-DR3 and 3XMM-DR5 have produced slightly different
likelihood values (of mostly real sources), so that detections
near the likelihood threshold cross the boundary, in both
directions, between catalogues, resulting in different lists of
detections. Overall, however, the 3XMM-DR5 procedure is better,
resulting in more sources.

Appendix E: Astrometry and the deviation from the
Rayleigh distribution

To explore the cause(s) of the deviations from the Rayleigh
curve, we first examined whether the outlier pairs in the tail
excess might be spurious XMM-quasar associations, though as
noted by Watson et al. (2009), the false match rate for quasars is
expected to yield far fewer mismatches than the numbers found
in the tail excess. To test this possibility we compared the
distribution of the 3XMM-DR5 EPIC band 8 flux ($F_X$) to SDSS (r
band) flux ($F_p$) ratio (i.e. $F_X/F_p$) for XMM-quasar pairs from
x > 3.5 (the region of the excess tail where the Rayleigh
function predicts negligible numbers) to that from x < 0.8 (where
the data and model match well). While pairs from the tail do
have a slightly (25%) higher $F_X/F_p$ ratio on average than those
at x < 0.8, it is too small a difference to be explained by
systematic mismatching. This conclusion is supported by
considering XMM-quasar pairs in the tail whose X-ray detections
belong to 3XMM-DR5 unique sources that include one or more
other X-ray detections with a quasar counterpart. Amongst 104
such unique sources involving an XMM-quasar pair from the
tail, only 13 are cases where all other constituent XMM-quasar
pairs have x > 3.5. These 13 cases might reflect mismatches of
the XMM detections and quasars. However, the X-ray detections
involved represent only 8% of the X-ray detections with quasar
counterparts that make up the 104 unique sources, suggesting
most of the XMM-quasar pairs from the tail are not mismatches.
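
The statement that the Rayleigh function predicts negligible
numbers at x > 3.5 can be checked with a short calculation. The
sketch below assumes that the error-normalised offsets follow the
unit Rayleigh distribution, p(x) = x exp(-x^2/2); the function names
are hypothetical.

```python
import math


def rayleigh_fraction_above(x: float) -> float:
    """Fraction of a unit Rayleigh distribution,
    p(x) = x exp(-x^2 / 2), lying above x."""
    return math.exp(-0.5 * x * x)


def rayleigh_fraction_below(x: float) -> float:
    """Fraction of a unit Rayleigh distribution lying below x."""
    return 1.0 - rayleigh_fraction_above(x)


tail = rayleigh_fraction_above(3.5)
core = rayleigh_fraction_below(0.8)
print(f"expected fraction with x > 3.5: {tail:.5f}")
print(f"expected fraction with x < 0.8: {core:.3f}")
```

Under this assumption, multiplying the tail fraction by the number
of pairs in a sample gives the expected number beyond x = 3.5,
which is the benchmark against which the tail excess is judged.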

We then constructed distributions for many XMM catalogue
parameters (e.g. position errors, off-axis angle, count rates,
equatorial and galactic location, exposure times, nearest-neighbour
distance etc.), comparing the distributions of the data subsets
from x > 3.5 and x < 0.8. The position error (POSERR)
distributions of the two subsets are very similar while the
XMM-quasar separations are markedly different, having an average of
0.4" for data with x < 0.8 compared to 5.5" for the x > 3.5
subset. There is an indication that the points at x > 3.5 tend to
lie at larger off-axis angles. No other trends could be discerned
from the distributions for other parameters. To push this further,
we also cross-matched the 6614 3XMM-DR5 detections with SDSS
quasar counterparts against the Chandra catalogue (Evans et al.
2014a). Within 10", 745 XMM detections have one or more
matches with Chandra detections - we retained only the nearest
match in the few instances where multiple matches were present.
The 3XMM-DR5 detections from the tail do tend to be notably
more offset from their Chandra counterparts compared to those
detections with x < 3.5. Furthermore, although numbers are
more limited, for the XMM-quasar pairs in the tail with
Chandra counterparts, the error-normalised offsets between the
Chandra detections and the SDSS quasars appear to provide a better
match to the Rayleigh distribution - there is no evidence of a
similar tail excess. This hints at the positions of the 3XMM-DR5
detections in the tail being incorrect. While, alternatively, their
position errors may be underestimated, if so, there is no clear
evidence the errors are being systematically underestimated, e.g.
being incorrectly characterised with off-axis angle. It is worth
noting that while the proportion is lower in the central regions of
the field of view, even in the $10' < \theta < 12'$ annulus
($\theta$ is the EPIC off-axis angle), 5.6% of the XMM-quasar
pairs have x > 3.5 - this demonstrates there is not a generic
problem with sources at higher off-axis angles.

As noted, there is an indication that XMM sources at x > 3.5
tend to lie at higher off-axis angles in the field than those from
lower x values. This is illustrated in figure 5 where, alongside the
histogram of error-normalised offsets for all the XMM-quasar
pairs (red), we show the histograms for data from off-axis angles
$\theta < 5'$ (blue), $5' < \theta < 10'$ (green) and
$10' < \theta < 15'$ (grey).
For sources near the centre of the field, the distribution peaks
too early, at $x = \Delta r/\sigma_{\rm tot} \sim 0.65$, but better
matches the tail at x > 2.5. Conversely, data from
$10' < \theta < 15'$ peak near x = 1
but account for much of the excess tail. We examined whether
this could arise from, for example, an error in the rotation
correction of the rectification process (see section 3.4.1) in some
observations. If so, for a given field, one might anticipate the
quasar counterparts having a systematic offset, either ahead of,
or behind, the X-ray detection, in a sector oriented
perpendicular to the radial vector, r, from the field boresight to
the X-ray position. This is not, however, evident in fields that
contain useful numbers (up to 22) of XMM-quasar pairs, one or more
of which come from x > 3.5. We also performed a more detailed
analysis in which the circularised statistical XMM position
error, RADEC_ERR, used previously was replaced with an error
derived from an error ellipse: an elliptical error contour should
better characterise positional uncertainties arising from the
elongated PSF profiles that become evident at larger off-axis
angles. Assuming the ellipse is oriented with the major axis
tangential to r, the mean geometry of the error ellipse as a
function of off-axis angle was obtained via the separate errors in
RA and DEC of all 3XMM-DR5 detections (available in the initial
emldetect source lists, though only the circularised RADEC_ERR
value is provided in the final observation summary source lists).
For each XMM-quasar pair, the idealised error ellipse of the X-ray
source was scaled to the measured RA and DEC errors and a mean of
the major and minor axes was obtained. Using the mean
elliptical positional uncertainty to normalise the XMM-quasar
offsets, even when combined with the elliptical errors for the
rectification correction and the elliptical quasar position errors,
still results in a notable excess at x > 3.5.

We conclude that the excess of 3XMM-DR5 detections with
error-normalised offsets from their SDSS quasar counterparts
>3.5 appears to have a modest dependence on the off-axis
location of the detection in the XMM field of view, with a small
fraction of detections at higher off-axis angles having either
incorrect positions or underestimated errors, while sources near
the centre may have slightly overestimated errors. However, no
systematic cause is identified.